My event counts between GA4 and UA are different. They are not drastically different (maybe about 10%) but the numbers are still off. If the tags and triggers are all the same in GTM, shouldn't the event counts be identical? what would cause it to be 10% off or is this normal??
First of all, it has at this point become agreed upon that GA4's pre-baked reports are less than reliable. We now suggest avoiding using pre-cooked reports in GA4 and instead either use the Explorer, or export the data and use something else entirely (preferably) if you have the resources for it.
Secondly, make sure you don't have Google Signals enabled since this changes thresholding/sampling logic. Also, switch to device-only reporting, it will help with thresholding. More on it here. It's important that the Signals are never enabled. It looks like the thresholding logic won't be fixed even if you disable them after enabling. Some report the thresholding reduce after you give it some months. Some claim it's due to Signals affecting the data and then the data can't be restored.
Event Properties Cardinality. Here's more about it. Despite what Google claims, GA4 is still full of bugs and unpleasant features. One of it would be the cardinality of your eps' values. Keep it low. Otherwise, sampling kicks in hard and you end up seeing a large percentage of your events as Other(Other). Even when you don't use the high cardinality dimension in your report.
Data retention. See the limitations on it here. Yep, no more free access to old data for your precious ad-hoc YoY analysis, so if you're counting old events, no luck. UA will show them to you, but GA4 wipes them. GA4 tries to still maintain the pre-cooked aggregated reports, but now you can't drill into them as you used to in UA, and they're not accurate anyway.
These are generic suggestions. More debugging would have to be done on your side to find out exactly what datapoints at what times aren't being counted. Data exports to BQ would help narrowing it down. But at this moment, the general consensus among analysts is that we shouldn't compare GA4's data to that of UA. I personally don't agree with that consensus, since it's always good to know the difference, but that has become almost an industry standard today.
I posted a finding to this thread.
had the same issue with purchase events being different for UA vs. GA4.
Universal Analytics was always showing higher numbers and the triggers were exactly the same.
Then I enabled data export to BigQuery and it turned out that GA4 shows only those transactions in the GA4 UI that have a value for the field user_pseudo_id (you only see this field in the BigQuery data export). There were transactions where the field was null and apparently these dont show up in the UI.
I would recommend looking at raw event in BigQuery, the data export is for free as long as you dont go crazy with ETLs and queries.
Considering doing some relatively large scale event tracking on my website.
I estimate this would create up to 6 million new events per month in Google Analytics.
My questions are, would all of this extra data that I'm now hanging onto:
a) Slow down GA UI performance
and
b) Increase the amount of data sampling
Notes:
I have noticed that GA seems to be taking longer to retrieve results for longer timelines for my website lately, but I don't know if it has to do with the increased amount of event tracking I've been doing lately or not – it may be that GA is fighting for resources as it matures and as more and more people collect more and more data...
Finally, one might guess that adding events may only slow down reporting on events, but this isn't necessarily so is it?
Drewdavid,
The amount of data being loaded will influence the speed of GA performance, but nothing really dramatic I would say. I am running a website/app with 15+ million events per month and even though all the reporting is automated via API, every now and then we need to find something specific and use the regular GA UI.
More than speed I would be worried about sampling. That's the reason we automated the reporting in the first place as there are some ways how you can eliminate it (with some limitations. See this post for instance that describes using Analytics Canvas, one my of favorite tools (am not affiliated in any way :-).
Also, let me ask what would be the purpose of your events? Think twice if you would actually use them later on...
Slow down GA UI performance
Standard Reports are precompiled and will display as usual. Reports that are generated ad hoc (because you apply filters, segments etc.) will take a little longer, but not so much that it hurts.
Increase the amount of data sampling
If by "sampling" you mean throwing away raw data, Google does not do that (I actually have that in writing from a Google representative). However the reports might not be able to resolve all data points (e.g. you get Top 10 Keywords and everything else is lumped under "other").
However those events will count towards you data limit which is ten million interaction hits (pageviews, events, transactions, any single product in a transaction, user timings and possibly others). Google will not drop data or close your account without warning (again, I have that in writing from a Google Sales Manager) but they reserve to right to either force you to collect less interaction hits or to close your account some time after they issued a warning (actually they will ask you to upgrade to Premium first, but chances are you don't want to spend that much money).
Google is pretty lenient when it comes to violations of the data limit but other peoples leniency is not a good basis for a reliable service, so you want to make sure that you stay withing the limits.
I'm currently researching a solution to monitor the performance of specific sections of a page. For example, you have a simple page with 2 images with links to other pages. You are driving lots of traffic to this page and you are experimenting with different contents on that page.
6 months after, you want to see which section of the page performed better with what kind of specific imges.
Let's imagine you require a report that should tell you the following: on average, the first spot performs better, but last week the image was bad and that's why you had less conversion from that spot.
I'd like to use such a system on a high-traffic homepage of an eCommerce website, in order to better monitor the usage of the selling spots.
I was thinking to use Google Analytics events with a positioning scheme (splitting the website in columns and rows, giving to each cell an identification ID such as a1 for column a, row 1) and keeping a local datawarehouse of creatives (images, promotions etc.), but apparently, after 10.000.000 hits per month, Analytics is recommending the premium version which is quite pricey (12k USD per month, 1 year upfront payment).
I was thinking about PIWIK as an alternative, but there is no event tracking there - or am I missing anything?
Looking forward to hearing your input on this matter.
You're better off with a provider like Optimizely for this use case. Still gonna be expensive, but it'll more quickly get you the information you need to make decisions.
We normally use multi variation tests or A/B tests to measure the success of user interfaces. Google Analytics have this feature and it is free.
This links maybe useful
https://www.youtube.com/watch?v=yDWTMOC_Dp4
https://support.google.com/analytics/answer/1745147?hl=en
Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 7 years ago.
Improve this question
I'm trying to determine why an enterprise wouldn't want to use Google Analytics.
Here are the main reasons I've seen mentioned:
Inability to track clients that have Javascript disabled.
Lack of ownership of the statistics - Google owns the data.
Most of the web clients with Javascript disabled will probably be bots/spiders. This data is interesting, but probably not very useful.
As for the ownership issue, this is a bit paranoid IMO.
What am I missing here? When is Google Analytics not good enough?
Here are my findings from additional research:
Google Analytics is limited to 5 million page views per month - source
If a web site generates more than 5 million pageviews per month it will need linked to an active AdWords account to avoid interruption of service.
Lack of / slow technical support
All Google support is handled through email and response times can take a week or more. Commercial analytics products often have much faster & personalized support.
Inability to track files (PDF's, Images, etc.)
GA relies on Javascript and files lack the ability to execute Javascript. The workaround to this problem is to tag the link, but this won't track requests that go directly to the file.
Limited ability to customize
This is a selling point that I see pushed by commercial analytics tools (WebTrends). However it's never explained what customizations are denied by GA but allowed by WebTrends.
The Google Analytics EULA does not allow you to track individual users by identifying them. So if you wanted to add a custom variable for username to track how many times each user logs in, then you would be in a gray zone if not outright violating the EULA.
I use Google Analytics on about 10 sites right now and it's a great tool. In addition to all the analytics stats, you can tie it in with AdSense and it becomes a marketing/revenue tool and not just "wow look at all these cool user stats". If there was a way to track by user ID in certain circumstances (e.g. if user's agreed to it, or if they work for the company that owns the site) then I would have no issues.
Besides, it's free and all you have to do is add JavaScript to the files, so give it a try and see what you think after a few months.
One reason that was, surprisingly, not posted:
timing / speed of reaction
It takes at least 4 hours (up to 24) for GA to update your data.
This is ok for me personally in most of the cases, but when reacting fast is crucial (news sites, one-off events, etc.) you may want to employ some other solution (Mint comes to mind, but it's not the only one out there of course).
Thought I'd add my two pence worth to this thread, as this a topic close to my heart and one I've debated with colleagues for years. We've used webtrends in house for as long as i can remember, back to version 4 of the log analyzer (how different things were back then!). Since Google Analytics came along, we've started to come under increasing pressure from certain parts of our business to switch, as 'it does everything we need form an analytics tool'
Well, true in many senses it does, especially these days. But I championed the integration of our CRM and web analytics tools back in 2006, and as our business isn't e-commerce (the 'conversion' happens offline, sometimes months after the visitor acquisition) we need to integrate in this way to get a true picture of campaign effectiveness, and notion of ROI.
All of this means, we need access to the raw data, need to be able to join visitor records on sessionID etc, without this access we'd be screwed. I'd love it if we could roll without it, but the current requirements mean we can't, so this alone is a HUGE reason why Google analytics is not good enough.
Over and out
For tracking desktop software or creating a whitelabel solution there are better solutions.
For white label an integration based analytics, i use MixPanel. For Desktop Software, i use Deskmetrics
Google Analytics does not work well with mobile phones. While the iPhone and the Palm may be supported, many of the existing handsets do not support the javascript that Google uses.
If you're based in the UK, then theoretically you could be breaking the Data Protection Act by using Analytics.
If information about your users (like which web pages they're looking at) goes "outside the European Economic Area" and onto Google's servers in the US, then you're breaking the DPA.
Pretty obscure, but you did ask :)
Piwik avoids the problem because you host it on your own servers.
Lack of ownership of the statistics - Google owns the data.
... As for the ownership issue, this is a
bit paranoid IMO.
One problem with it is that we can't even access the raw data. We had a use case this week where we wanted a visitor map for an executive presentation. We needed to get more flexible with how the visitor map is displayed (wanted to view the map in Google Earth plug-in). In GA, you can't. You take what they give you. You can see a map of how many visits came from each city, but you can't export a data file of cities and number of visits, to run the data through other tools. So, paranoia aside, there are significant limitations on what you can accomplish with GA.
However this is not a problem if you use Urchin, the self-hosted version of GA: you can export the data and do what you want with it. (And the exported data is richer than the web server log's, as it includes some analysis already.)
Since Piwik is open source, and pluggable, I imagine you could enhance the visitor map plug-in any way you wanted to. And export whatever data you want.
Whether this limitation affects you depends on your needs, obviously.
Update: I've now looked at the GA Data Export API, and it turns out that things you cannot do through the UI (as you can with Urchin), you can do with this API. It does look like you can export the visit data I was talking about, via a feed (although there are daily traffic caps on those requests). So sprinkle salt heavily on what I wrote above.
A couple more points that I've come across:
GA doesn't let you dig beyond full-day statistics; I would often like the ability to investigate whether a traffic dip the previous day was caused by the design update I did at 1pm or the soccer match on TV at 8pm.
GA doesn't offer a workaround for traffic spikes caused by DDoS attacks, Slashdotting etc. When I'm looking at a GA visitor graph of 2009, all I can see is the 2-million-pageview-spike on October 16th, pushing the entire rest of the year down flat against the horizontal axis of the graph. To get a meaningful graph, GA should offer the ability to trim or exclude outlying data points, or the ability to limit/bracket the graph window itself
GA doesn't have an event monitoring client (think Reinvigorate's Snoop tool)
While GA is very user-friendly, I've found it's not as granular as some of the other stats programs (or maybe I'm not looking in the right places). Before the marketing monkeys I work with began pushing GA, we were very satisfied with AWStats. The sheer scope of the data helped us on several occasions hone sites to better suit their audience. While GA is very shiny and laid out well, I personally still prefer the raw numbers like I used to get through AWStats.
Slow data processing speed - Can be as low as 15-30 mins for page views, but may be up to 48 for eCommerce
EULA is limiting in some cases
You won't own or have any control of the data. Google's engineers might use it (anonymously) for testing
Anything more complex requires customization - Downloads and such care of no issue, but there are limits
Cross domain tracking by linker is faulty at best
Visit based - Proper tools are based on Visitor level, GA works on Visit based reporting mostly
Limited number of custom vars used at one time (5)
No tech support, if you're realistic
Usually when there is a downtime notice, it's already gone
API limitations (4 dimensions and 10 metrics at one time, not all can be used together in addition to that)
I have many more, but at the end of the day it is a good tool for it's price.
From the non-technical point, I think the most important is that some enterprise has the high level data security policy. All of the data should be controlled and managed by themselves.
If you use the Google analytics,the data is stored in google's server. For some special enterprise, like insurance, financial company. The policy should be followed.
I would NOT go with server logs. In fact I have them disabled on my server. Why you ask me?
For the simple reason that everytime you hit my server that stupid logging program makes an entry in the physical log file on my HDD. So if my server gets 100,000 hits in a day that's 100,000 time a HDD write operation happens.
You think that's cool? Well it's not. It's slowing your server down, specially if the log file is huge.
Why would someone even consider doing that to their server? Specially when we're working so hard to minify javascript, css and make image files 2 KB smaller!
Please do yourself a favor don't log directly on your server.
At least Google Analytics logs it on Google's server so my server's healthier.
I wouldn't use it for any of my sites, because you're forcing the user to accept your proprietary JavaScript code in their browser, which is bad. Also, giving your data is Google is a really bad idea.
See Piwiki for something you can run yourself as in free software, eliminating both of the problems.
I want to count the visits on a web page, and this page represents an element of my model, just like the Stack Overflow question page views.
How to do this in a reliable (one visit, one pageview, without repetitions) and robust (thinking on performance, not just a new table attribute 'visits_count')
You can't.
Multiple people can visit your site from the same IP (makes ip storage useless)
Multiple people can visit your site from the same PC
Multiple people can visit your site from the same browser (makes cookies useless)
The closest you can get is to store the visitors IP in combination with a cookie to not count those in the future. Here is a tradeoff, if they clear the cookie, they are a new visitor. If you only store the IP, you count whole proxies as one visitor.
Another option is to use user accounts and track exactly which user viewed what page, but this is not really a good option for public sites.
I would suggest, actually, using Google Analytics. While nothing will ever be truly accurate, they do take a pretty gosh-darn good stab at it, and the level of reporting and useful information you can extract is amazing.
Not to mention being free for up to 4,000,000 page views a month.
http://www.google.com/analytics
One way to handle the performace issue, is to do the calculations on insert. Something like
UPDATE stat SET viewcount = viewcount+1 WHERE date = CURDATE()
It isn't the perfect solution but could get you some of the way.
To do it without repetitions is potentially a tricky problem. I would simply rely on using sessions or cookies, but you could come up with all kinds of strategies for filtering crawlers, clients with out cookie-support and so on.