Send events directly to a server container via HTTP request instead of a web container - google-tag-manager

After some experimenting, I noticed it is possible to send events directly to a server container via HTTP request instead of pushing them to the data layer (which is connected to a web container). A big advantage of this setup is that the front-end doesn't need to load any GTM script. Yet I have some doubts, because I can't find much documentation about this setup. It also brings some challenges, like implementing the automatically collected events (e.g. page_view). Does anyone have experience with this setup, or can anyone tell me why I shouldn't follow this path?
Regards, Thomas

This is definitely not a best practice, although it is actually the technically more beneficial path, for a few reasons:
Can make your tracking completely immune to adblockers.
Has the potential to protect you from malicious analytics spam, and makes it much harder for third parties to spoil your data.
Doesn't surface your analytics stack and libraries to the public.
Is typically way lighter than the GTM lib.
Gives you a much finer degree of control over what happens and much more power over the tracking.
But this is only if you have the competency to develop it, which is a rarity, actually. Web developers normally don't know analytics well enough to make it work well, while analytics developers lack the engineering knowledge. You suddenly can't just hire a junior or mid-level implementation expert to help with the tracking, and a lot of those who call themselves seniors wouldn't be able to maintain raw JS tracking libraries either.
As you've mentioned, you won't be able to rely on the automatic tracking from the GTM or gtag libraries. But not having the automatic events is actually not the issue. The more important thing is manually collecting all dimensions, including the proper maintenance of client IDs and session IDs.
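To make that concrete, here is a minimal front-end sketch of what maintaining your own client and session IDs might look like. The cookie names and the 30-minute session window are illustrative choices, not a standard:

```typescript
// Minimal sketch of maintaining your own client and session IDs.
// All names (cookie keys, timeouts) are illustrative.

const SESSION_TIMEOUT_MS = 30 * 60 * 1000; // mirrors GA's default 30-minute session window

function readCookie(name: string): string | null {
  const match = document.cookie.match(new RegExp(`(?:^|; )${name}=([^;]*)`));
  return match ? decodeURIComponent(match[1]) : null;
}

function writeCookie(name: string, value: string, maxAgeSeconds: number): void {
  document.cookie = `${name}=${encodeURIComponent(value)}; Max-Age=${maxAgeSeconds}; Path=/; SameSite=Lax`;
}

// Client ID: long-lived, identifies the browser across visits.
function getClientId(): string {
  let id = readCookie("my_client_id");
  if (!id) {
    id = crypto.randomUUID();
    writeCookie("my_client_id", id, 60 * 60 * 24 * 365 * 2);
  }
  return id;
}

// Session ID: rotated after 30 minutes of inactivity, like GA's session logic.
function getSessionId(): string {
  const now = Date.now();
  const lastSeen = Number(readCookie("my_session_ts") ?? 0);
  let id = readCookie("my_session_id");
  if (!id || now - lastSeen > SESSION_TIMEOUT_MS) {
    id = crypto.randomUUID();
    writeCookie("my_session_id", id, 60 * 60 * 24);
  }
  writeCookie("my_session_ts", String(now), 60 * 60 * 24);
  return id;
}
```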
Once your front-end is ready, it's important to note that you don't want to expose your server-side GTM's endpoint. I mean, you can, but it would defeat the purpose significantly. You want to create a mirror on your backend that reroutes the events to the sGTM.
Finally, you may want to add some kind of encryption/protection/validation/authentication logic for the data on your mirror. Consider it because, without surfacing the endpoints, you're now able to further conceal what you're doing and thus avoid much of the potential data tampering. This won't make it impossible to look into what you're doing, of course, but it will make casual interference nearly impossible.
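A minimal sketch of such a mirror, assuming a Node/Express backend; the route path, header name, and environment variables are all illustrative:

```typescript
import express from "express";
import { createHmac, timingSafeEqual } from "node:crypto";

const app = express();
// Keep the raw bytes so the HMAC is computed over exactly what the client signed.
app.use(express.raw({ type: "application/json" }));

const SGTM_URL = process.env.SGTM_URL!;       // the non-public sGTM endpoint (illustrative)
const SIGNING_KEY = process.env.SIGNING_KEY!; // shared with the front-end build

app.post("/t", async (req, res) => {
  // Reject requests whose HMAC signature doesn't match the body.
  const expected = createHmac("sha256", SIGNING_KEY).update(req.body).digest();
  const received = Buffer.from(String(req.header("x-signature") ?? ""), "hex");
  if (received.length !== expected.length || !timingSafeEqual(received, expected)) {
    res.status(403).end();
    return;
  }

  // Forward the validated event, unchanged, to the concealed sGTM endpoint.
  await fetch(SGTM_URL, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: req.body,
  });
  res.status(204).end();
});

app.listen(3000);
```

Note that any signing key used this way ships with the front-end bundle, so it only raises the bar against casual tampering, exactly as described above.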
In the end, people don't do it because it would effectively double the monetary cost of tracking: sufficiently skilled experts charge approximately double what regular analytics folks charge, while the clarity of the data will only grow by about 10-20%. Such an exchange generally doesn't make business sense unless you're a huge corporation for which even enterprise analytics solutions like Adobe Analytics are not good enough. Amazon would probably be a good example.
Also, if you're already redefining users and sessions, you're not that far from using something like Segment for tracking, ETLing all of that into a data warehouse, and using a proper BI tool for further analysis. At that point, is there still any sense in having the sGTM at all, when you can stream your events to Segment in real time from your mirror and have it seamlessly re-integrate the data into GA, Firebase, AA, Snowflake, Facebook, and tens if not hundreds more destinations, all server-side?
You want to know where to stop, and the best way to do it is by assessing the depth of the analysis/data science your company is conducting on the user behavioral data. And in 99% of cases, it's not deep enough to even consider sGTM.

In response to #BNazaruk
So it's been a while now… I've been looking into the setup, because it's just way too cool. I also took a deeper dive into CGTM to better understand the benefits of SGTM. And honestly, everything that has the potential to replace CGTM should be considered. My main reasons are:
Cybersecurity - Through injection it is possible to insert malicious software like keyloggers. The only thing preventing this is the login credentials to CGTM, which are, relatively speaking, easy to obtain with targeted phishing.
Speed - A CGTM setup with about 10-15 tags means an average performance loss of 40 points in Lighthouse.
Quality - Like you said: because of browser restrictions like cookie policies and ad blockers that intercept/manipulate/block CGTM signals, on average 10-20% of events are not registered properly.
Mistakes - Developing code outside a proper dev process limits insight into the impact of that code, with possible errors or performance loss as a result.
So far I have created a standardized setup (container templates, measurement plans, libraries) for online marketers and developers to use. Within the setup, we maintain our own client and session IDs, and developers are able to make optimal use of SGTM and increase productivity drastically. The only downside is that we still use CGTM to implement page_view and exceptions, which is a shame, because I'm not far away from a full server-to-server setup. Companies are still too skeptical to fully commit to SGTM, I guess. Though my feeling is that in five years' time, high-end apps won't use CGTM anymore.
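For what it's worth, here is a hypothetical sketch of the missing server-to-server page_view, assuming the server container runs a client that accepts GA4 Measurement Protocol payloads. The domain, measurement ID, and secret are placeholders that depend entirely on the container setup:

```typescript
// Hypothetical server-to-server page_view. The endpoint shape follows the
// GA4 Measurement Protocol; all credentials below are placeholders.
async function sendPageView(clientId: string, pageLocation: string): Promise<void> {
  const endpoint =
    "https://sgtm.example.com/mp/collect?measurement_id=G-XXXXXXX&api_secret=SECRET";
  await fetch(endpoint, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({
      client_id: clientId,
      events: [{ name: "page_view", params: { page_location: pageLocation } }],
    }),
  });
}
```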
Once again, thanks for your answer, it’s been an important part of my journey.

Related

Using Waze functions with an external app

I'd like to know if it is possible to use Waze's community API data, such as danger points, in an external app and integrate them with some of our features.
I'm afraid Waze has strict rules concerning who they expose user-submitted reports to. They provide a feed of reports for specific areas through the Waze Connected Citizens Program (CCP), but this program is generally only available to governmental institutions or road authorities. Generally the program also expects a data flow towards Waze (road closures, for example).
That said, I don't work for Waze and I don't know all the details of the app you are working on. Depending on the type of app you are talking about, they may like to collaborate. You could always try to apply as a CCP partner, and perhaps they'll make an exception. Just be aware that while Waze has a lot of users, it is actually not a very big company, so support queries may take a while to get a reply.
Note that it is technically possible to scrape the information from the Waze Live Map, but I'd strongly advise against doing that without permission, as it could lead to legal action.

Falcor GraphQL in big project

I have read a lot of articles about Falcor and GraphQL, and no one can say how they help in big projects! I have used Redux + React for a long time (also a REST API), and I can't understand what BIG problem Falcor and GraphQL solve.
Can someone explain it in a very simple way?
When you try to understand a new thing such as GraphQL, it helps to compare it with something existing, for example, REST, which you already know.
Imagine we have several web and mobile applications that retrieve data from the same server. In a RESTful architecture, we design each entity as a resource. When a request to fetch a resource is received, the server usually returns everything about that resource. Thus, the clients get redundant and unnecessary data, which consumes bandwidth. Depending on the scenario, this can add up to an amount significant enough to hurt the client's performance (think about mobile clients).
What if the clients could specify exactly which data they need, and the server sent only that data? GraphQL enables us to achieve this.
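For example, with a hypothetical schema that has a `user` type carrying many fields, the client can ask for just the two fields it needs, and the server returns exactly those:

```typescript
// The schema, endpoint URL, and field names here are all illustrative.
const query = `
  query GetUser($id: ID!) {
    user(id: $id) {
      name
      email
    }
  }
`;

async function getUser(id: string) {
  const response = await fetch("https://api.example.com/graphql", {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ query, variables: { id } }),
  });
  const { data } = await response.json();
  return data.user; // { name: "...", email: "..." } and nothing else
}
```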
Is GraphQL suitable for BIG projects?
Like pretty much everything in life, it depends. Not all projects, regardless of their size, have the same requirements. Determine your project's requirements, consider the available technologies and their pros and cons, and accept that it's a trade-off. There's no silver bullet or one-size-fits-all solution. Nonetheless, Facebook uses GraphQL, and there are strong reasons to consider their project BIG.

Will Google block my access if I use their features without token?

I'm using this link https://www.google.com/reader/api/0/stream/contents/feed/FEEDHERE?output=json&n=20
to fetch feeds using Google's algorithm. As you can see, I'm not adding any other parameters, just fetching the returned data in JSON format. Hopefully my app will be heavily used, and if I send a lot of requests to this link, will Google block my access or something?
Is there anything I can include, like userip, url for my app (so if they have problem to just contact me) or something else?
The most basic answer to your question is that Google will change its Terms of Service whenever it likes, and you've got no say in the matter. So if it's allowed today, it might not be allowed tomorrow, at Google's whim.
On this issue, though, you seem fairly safe. From the Terms of Service (this is the general document, since Reader doesn't seem to have a specific one):
Don’t misuse our Services. For example, don’t interfere with our Services or try to access them using a method other than the interface and the instructions that we provide.
Google provides these feeds as RSS and Atom, so I assume they expect them to be used. They don't say that it's a misuse to point someone else at those feeds, so it looks OK for now, but they could add such a clause at any time.
All online services are subject to the terms and conditions of their providers. So, as others have said, Google may be OK with your use today but change their mind any time down the line. I doubt including a URL or email or contact info will help, because when these services change, the providers don't notify every user individually; they just announce the change publicly. Usually they give several months' notice to give users a chance to adapt their applications, but this is not standardized or enforced, so there is no guarantee. One example would be the fairly recent discontinuation of the Google Finance API (for which no replacement has been announced).
The safest approach would be to design your app so that this feature that uses Google's functionality is decoupled as much as possible from the rest of the app. That way, when or if the availability of the service changes (i.e. it's no longer available at all), you can adapt your app to use some other source for the feeds with minimal impact on the rest of the app. Design for change and plan for the worst.
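A rough sketch of that decoupling, with all names illustrative (the JSON field names are assumed from the feed's output format, so verify them against an actual response):

```typescript
// Hide the feed provider behind an interface so the Google dependency
// can be swapped out later with a one-class change.
interface FeedItem {
  title: string;
  link: string;
  published: Date;
}

interface FeedSource {
  fetchItems(feedUrl: string, limit: number): Promise<FeedItem[]>;
}

// Today's implementation wraps the Google endpoint...
class GoogleReaderSource implements FeedSource {
  async fetchItems(feedUrl: string, limit: number): Promise<FeedItem[]> {
    const url = `https://www.google.com/reader/api/0/stream/contents/feed/${encodeURIComponent(feedUrl)}?output=json&n=${limit}`;
    const json = await (await fetch(url)).json();
    return json.items.map((item: any) => ({
      title: item.title,
      link: item.alternate?.[0]?.href ?? "",
      published: new Date(item.published * 1000),
    }));
  }
}

// ...and the rest of the app only ever sees the interface.
async function showLatest(source: FeedSource, feedUrl: string): Promise<void> {
  const items = await source.fetchItems(feedUrl, 20);
  for (const item of items) console.log(item.title, item.link);
}
```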

Google Analytics vs ddos

What I'm wondering is: what kind of behaviour does Google Analytics show when a DDoS attack occurs? Any theories?
My theory would be that an effective DDoS platform/script would not include anything as heavyweight as a JavaScript engine, and that therefore the DDoS activity would not show up in Google Analytics at all.
The point of a DDoS attack is to overwhelm the server with a flood of requests. Any CPU cycles spent evaluating JavaScript in the response that the server sends back are cycles that could better be used churning out more requests to the server. I would fully expect a properly executed DDoS attack not to waste time parsing the response from the server, or even reading it off of the underlying socket, let alone interpreting and executing any JavaScript that may be embedded in the markup or fetching scripts and other resources from domains other than the target server.
Of course, this does not preclude the possibility of an exceptionally naive DDoS attack implemented using web frameworks and libraries that do evaluate embedded JavaScript. Such an attack would not (or rather, should not if you've implemented your server code correctly) be very effective, but it would likely generate a spike in Google Analytics traffic.
It depends on the way the DDoS is implemented. If it's simply an executable distributed to multiple machines, making simple HTTP queries using native TCP sockets, then Google Analytics wouldn't notice anything at all, because the JavaScript that gets returned would never be executed.
However, other sorts of DDoS attacks could leverage actual browsers distributed across many machines. For instance, if you could hack the Yahoo home page and insert an <iframe src='takemedown.com'> into it, you could easily DDoS "takemedown.com". In this particular scenario, GA would certainly detect the impressions, and because (depending on the scenario) there might be an HTTP referrer tag, you could possibly run a report in GA that pulls out the suspicious impressions.
But there are other similar scenarios that wouldn't leave any particular footprints. For instance, if you could hack Lady Gaga's twitter account, you could send out a link to her 16MM followers, and a significant number would immediately click on it: and since most of those clicking on it would probably be doing so from within a separate app, there wouldn't be any referrer tag, and no particular way of identifying the requests.
In other words, it all depends, but it's probably not a terribly useful avenue to investigate. In many (most?) scenarios, GA wouldn't even recognize the impression; and in many others, wouldn't have any reasonable way of picking out the good impressions from the bad.
It will 100% show up as significant peaks in Google Analytics, simply because there is a huge number of requests from multiple sources with a huge bounce rate!
When an HTTP DDoS attack occurs, the attacker is using several (often thousands of) computers, and sometimes servers, to do so. When they make the requests, they don't render the JavaScript or anything; in most cases they simply make a GET request to the web page.
So no, it shouldn't really have an impact on Google Analytics.
Well, I'm also searching for this kind of information, but I have some considerations about the answers:
You will probably not see the attack itself with Google Analytics, but you should see the results. I mean, a DDoS is a "distributed denial of service", so, if the service is effectively denied, you should see a flat line on the graph in Google Analytics.
It depends how the bot works, but here's what happened to my website:
[Image: Google Analytics real-time report during the attack]
As well as the increase in traffic, you will likely see your bounce rate go sky high and average time on page significantly drop, which I'm sure can have a negative impact on SERPs.
For me it coincided with a Google update, so at first I put it down to that, but I started getting a lot of traffic to the root, terms, and privacy pages, with many requests carrying a /?m=0 query string, which is in itself odd (and I'd love for someone to shed light on that).
The attack caused a great deal of timeouts and was painful to fix:
In short, I hooked up Cloudflare, then created Security -> WAF rules to challenge countries where I was receiving most of the bot traffic. I also switched on the basic Bot Fight Mode (there's a more effective Super Bot Fight Mode with the paid subscriptions).
The other interesting point was why my site was subject to a DDoS attack at all. I wish I knew, but around the time the attack started I was approached by someone who enquired about buying the website. Possibly a tactic to get me to sell it, or sell it cheap.

Check if anyone is currently using an ASP.Net app (site)

I build ASP.NET websites (hosted under IIS 6 usually, often with SQL Server backends and forms authentication).
Clients sometimes ask if I can check whether there are people currently browsing (and/or users currently logged in to) their website at a given moment, usually so they can safely do a deployment (they want a hotfix, for example).
I know the web is basically stateless so I can't be sure whether someone has closed the browser window, but I imagine there'd be some count of not-yet-timed-out sessions or something, and surely logged-in-users...
Is there a standard and/or easy way to check this?
Jakob's answer is correct but does rely on installing and configuring the Membership features.
A crude but simple way of tracking users online would be to store a counter in the Application object. This counter could be incremented/decremented upon their sessions starting and ending. There's an example of this on the MSDN website:
Session-State Events (MSDN Library)
Because the default Session Timeout is 20 minutes, the accuracy of this method isn't guaranteed (but then that applies to any web application, due to the stateless and disconnected nature of HTTP).
I know this is a pretty old question, but I figured I'd chime in. Why not use Google Analytics and view their real time dashboard? It will require minor code modifications (i.e. a single script import) and will do everything you're looking for...
You may be looking for the Membership.GetNumberOfUsersOnline method, although I'm not sure how reliable it is.
Sessions, suggested by other users, are a basic way of doing things, but are not too reliable: they work well in some circumstances, but not in others.
For example, if users are downloading large files, watching videos, or listening to podcasts, they may stay on the same page for hours (unless the requests to the binary data are tracked by ASP.NET too), but are still using your website.
Thus, my suggestion is to use the server logs to detect whether the website is currently being used by many people (a rough sketch follows the list below). Doing so gives you the ability to:
See what sort of requests are made. It's quite easy to tell humans and crawlers apart, and with some experience it's also possible to see whether the human is currently doing something critical (such as writing a comment on a website, editing a document, or typing in a credit card number to order something) or not (such as browsing).
See who is making those requests. For example, if Google is crawling your website, it is a very bad idea to go offline, unless search ranking doesn't matter to you. On the other hand, if a bot has been trying for two hours to crack your website by making requests to different pages, you can go offline for sure.
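As a rough illustration, here is a sketch that splits recent access-log lines into bot and human traffic. The combined log format, the log path, and the bot patterns are all assumptions:

```typescript
import { readFileSync } from "node:fs";

// Crude heuristic: common substrings found in crawler user agents.
const BOT_PATTERN = /bot|crawler|spider|slurp/i;

function summarize(logPath: string): { bots: number; humans: number } {
  let bots = 0;
  let humans = 0;
  for (const line of readFileSync(logPath, "utf8").split("\n")) {
    if (!line.trim()) continue;
    // In the combined log format, the user agent is the last quoted field.
    const quoted = line.match(/"([^"]*)"/g) ?? [];
    const userAgent = quoted[quoted.length - 1] ?? "";
    if (BOT_PATTERN.test(userAgent)) bots++;
    else humans++;
  }
  return { bots, humans };
}

console.log(summarize("/var/log/nginx/access.log")); // path is an assumption
```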
Note: if a website has some critical areas (for example, writing this long answer, I would be angry if Stack Overflow went offline a few seconds before I submit it), you can also send regular AJAX requests to the server while the user stays on the page, as in the sketch below. Of course, you must be careful when implementing such a feature, and take into account that it will increase the bandwidth used and will not work if the user has JavaScript disabled.
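A minimal sketch of such a heartbeat; the /heartbeat endpoint and the interval are hypothetical, and the server would simply bump a per-session "last seen" timestamp when it receives the ping:

```typescript
const HEARTBEAT_INTERVAL_MS = 60_000; // one ping per minute

setInterval(() => {
  // keepalive lets the request complete even if the page is being unloaded.
  fetch("/heartbeat", { method: "POST", keepalive: true }).catch(() => {
    // Ignore failures; a missed ping just makes the user look idle.
  });
}, HEARTBEAT_INTERVAL_MS);
```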
You can run the netstat command and see how many active connections exist to your website's ports.
Default port for HTTP is *:80.
Default port for HTTPS is *:443.
