We can use IP Anonymization to circumvent the need for providing GA Data in the event of the GDPR Subject Access Request. Do we loose anything by doing this as far as GA features go? Why is it not the default?
This article does a pretty good job of answering your question. Basically you will just be sacrificing some precision in your geographic reporting.
Related
After some experimenting, I noticed it is possible to send events directly to a server container via HTTP request instead of pushing to the data layer (which is connected to a web container). A big advantage of this setup is that the front-end doesn't need to load any GTM script. Yet, I have some doubts because I don't find much documentation about this setup. This setup also brings some challenges like implementing automatically collected events (e.g. page_view). Does anyone have experience with this setup or is able to tell me why I shouldn't be following this path?
Regards, Thomas
This is definitely not a best practice, although this is actually a technically more beneficial path since... A few things, actually:
Can make your tracking completely immune to adblockers.
Has the potential to protect from malicious analytics spam, also makes it way harder for third parties to spoil your data.
Doesn't surface your analytics stack and libraries to the public.
Is typically way lighter than the GTM lib.
You have a much better degree of control about what happens and have much more power over the tracking.
But this is only if you have the competency to develop it, which is a rarity, actually. Normally web-developers don't know analytics well enough to make it work well while analytics developers lack the technical knowledge. You now suddenly can't just hire a junior or mid implementation expert to help with the tracking. A lot of those who call themselves seniors wouldn't be able to maintain raw JS tracking libraries either.
As you've mentioned, you won't be able to rely on automatic tracking from GTM or gtag libraries. And not having automatic events is actually not the issue. The more important thing is manually collecting all dimensions, including the proper maintenance of client ids and session ids.
Once your front-end is ready, it's important to note that you don't want to expose your server-side GTM's endpoint. I mean, you can, but this would defeat the purpose significantly. You want to make a mirror on your backend that would reroute the events to the sGTM.
Finally, you may want to make up some kind of data encryption/protection/validation/authentication logic on your mirror for the data. You may consider it just because without surfacing the endpoints, you're now able to further conceal what you're doing thus avoiding much of potential data tampering. This won't make it impossible to look into what you're doing, of course, but it will make nearly impossible any casual interference.
In the end, people don't do it because this would effectively double the monetary cost of tracking since sufficient experts would charge approximately double from what regular analytics folks charge. However, the clarity of data will only grow about 10-20%. Such an exchange generally doesn't make business sense unless you're a huge corporation for which even enterprise analytics solutions like Adobe Analytics is not good enough. Amazon would probably be a good example.
Also, If you're already redefining users and sessions, you're not that far from using something like Segment for tracking and then ETLing all that into a data warehouse and use a proper BI tool for further analysis. And now is there still sense in having the sGTM at all if you can just stream your events to Segment realtime from your mirror, and then it can seamlessly re-integrate this data into GA, Firebase, AA, Snowflake, Facebook and tens if not hundreds more destinations, and this all server-side.
You want to know where to stop, and the best way to do it is by assessing the depth of the analysis/data science your company is conducting on the user behavioral data. And in 99% of cases, it's not deep enough to even consider sGTM.
In response to #BNazaruk
So it's been a while now… I've been looking into the setup, because it’s just way too cool. I also took a deeper dive into CGTM to better understand the benefits of SGTM. And honestly, everything that has the probability to replace CGTM should be considered. My main reasons are;
Cybersecurity - Through injection it is possible to insert malicious software like keyloggers. The only thing that withholds this, are the login details to CGTM. These are, relatively speaking easy to get with targeted phishing.
Speed - A CGTM setup, with about 10 - 15 tags, means an avg performance loss of 40 points in Lighthouse.
Quality - Like you said; because browser restrictions like cookie policies and ad blockers that intercept/manipulate/block CGTM signals: On avg. 10-20% of the events are not registered in proper fashion.
Mistakes - Developing code outside a proper dev process, limits the insight into the impact of the code with possible errors or performance loss as a result.
So far I have created a standardized setup (container templates, measurement plans, libraries) for online marketers and developers to use. Within the setup, we maintain our own client and session ID’s. Developers are able to make optimal use of SGTM and increase productivity drastically. The only downside to the setup is that we still use CGTM to implement page_view and exceptions. Which is a shame, because I’m not far away from a full server-to-server setup. Companies are still too skeptical to fully commit to SGTM I guess. Though, my feeling says that in 5 years time, high-end apps won't use CGTM anymore.
Once again, thanks for your answer, it’s been an important part of my journey.
I am working on a project to remove gtag based tracking from a website. We have concerns about not being able to implement tracking due to poor network connections. The ideal solution would be to handle conversion tracking on the backend when we receive a purchase request (so as not to send too many requests.) It seems that Google Analytics Measurement Protocol would be the correct tool for the job. However, this (somewhat frustrating) note mentions that "only partial reporting may be available."
We have explored server side GTM, but that still adds unwanted network requests and is not the approach we want to go for. It seems that there are workarounds such as in this question, but they seem pretty precarious. Is there any other api available, or approach that might fit the use case we are looking for?
I'm searching for a way to access the GA Attribution (beta) project data automatically.
Especially I need this table (screenshot 1).
I want to extract it automatically every day, but I cant seem to find any API for that.
Is there a way to get this data automatically?
I thought about using a webscraper, but I'm not sure if that's allowd/possible on GA.
Thank you in advance.
As off 22.02.2022:
Thanks for contacting Google Analytics 360 support team. I hope you
are doing good!
I have reviewed your chat and understand that you are looking for
option of downloading Model comparison table using API.
I would like to inform that we have already received several request
from clients on this and our engineers are in the process of reviewing
the request (feature request). I would like to inform you that I have
added your details to the list of interested clients and I am looking
forward to it.
Please note that feature request implementation completely depends on
our engineers so it is really difficult to provide an estimated
timeline but I am positively looking forward for the same. If you
still have anymore questions on the same you may reach out to your
partner manager and discuss further.
I hope this email helps, in case you have any further questions please
feel free to reply and I will be glad to assist further.
I'm trying to get a better understanding of how Google docs makes and handles requests for changes in documents on the fly, and am wondering if there is something specific I could be monitoring with Wireshark.
Thank you.
Google Docs uses HTTPS to encrypt its communication. You cannot monitor any specific content-related changes, only witness that a user is contacting docs.google.com.
That said, if you're willing to delve into the realm of timing analysis, you may be able to reconstruct details from encrypted sessions. But this requires a significant amount of manual work.
What i'm wondering is, what kind of behaviour does google analytics show when a ddos attack occurs? Any theories?
My theory would be that an effective DDoS platform/script would not include anything as heavyweight as a JavaScript engine, and that therefore the DDoS activity would not show up in Google Analytics at all.
The point of a DDoS attack is to overwhelm the server with a flood of requests. Any CPU cycles that are spent evaluating JavaScript in the response that the server sends back are cycles that could better be used churning out more requests to the server. I would fully expect a properly executed DDoS attack to not waste time parsing the response from the server, or even reading it off of the underlying socket, let alone interpreting and executing and JavaScript that may be embedded in the markup or fetching scripts and other resources from domains other than the target server.
Of course, this does not preclude the possibility of an exceptionally naive DDoS attack implemented using web frameworks and libraries that do evaluate embedded JavaScript. Such an attack would not (or rather, should not if you've implemented your server code correctly) be very effective, but it would likely generate a spike in Google Analytics traffic.
It depends on the way that the DDOS is implemented. If it's simply an executable distributed to multiple machines, making simple HTTP queries using native TCP sockets, then Google Analytics wouldn't notice anything at all: because the JavaScript that gets returned would never be executed.
However, other sorts of DDOS attacks could leverage actual browsers distributed across many machines. For instance, if you could hack the Yahoo home page and insert an <iframe src='takemedown.com'> into it, you could easily DDOS "takemedown.com". In this particular scenario, GA would certainly detect the impressions, and because (depending on the scenario) there might be an HTTP referrer tag, you could possibly run a report in GA that could pull out the suspicious impressions.
But there are other similar scenarios that wouldn't leave any particular footprints. For instance, if you could hack Lady Gaga's twitter account, you could send out a link to her 16MM followers, and a significant number would immediately click on it: and since most of those clicking on it would probably be doing so from within a separate app, there wouldn't be any referrer tag, and no particular way of identifying the requests.
In other words, it all depends, but it's probably not a terribly useful avenue to investigate. In many (most?) scenarios, GA wouldn't even recognize the impression; and in many others, wouldn't have any reasonable way of picking out the good impressions from the bad.
It will show up 100% some significant peaks in google analytics , simply because there are huge number of requests from multiple sources having huge bounce rate !
When a HTTP DDoS attack occurs the attacker is either using several (thousands) of computers to do so. Sometimes, it's also done with servers. When they make the request, they don't render the javascript or anything - they simply in most cases just make a GET request to the webpage.
So no, it shouldn't really have an impact on GoogleAnalytics
Well, I'm also searching this kind of information, but I have some considerations about the answer:
You will probably not see the attack itself with Google analytics, but you should see the results, I mean, a DDoS is a "distributed deny of service", so, if the service is effectively denied, then you should see a flat line on the graph on Google analytics.
It depends how the bot works, but here's what happened to my website:
Google Analytics real time report for the monk
As well as the increase in traffic you will likely see your bounce rate go sky high and average time on page significantly drop - which I'm sure can have a negative impact on SERPS.
For me it coincided with a Google update so first I put it down to that, but I started getting a lot of traffic to the root page, terms, and privacy, with many prefixed with /?m=0 which is in itself odd (and I'd love for someone to shed light).
The attack caused a great deal of timeouts and was painful to fix:
In short, I hooked up CloudFlare, then created Security -> WAF rules to challenge countries where I was receiving most of the bot traffic. I also switched on the basic bot attack mode (there's a more effective super bot attack mode with the paid subscriptions).
The other interesting point of note was why was my site subject to a DDOS attack. I wish I knew, but at a similar time to when the attack started I was approached by someone who enquired about buying the website. Possibly a tactic to get me to sell it/sell it cheap.