A good estimate as to how many visitors are bots - google-analytics

My blog is about 7 months old. At my current level I usually get around 100 sessions per day. I have always actively filtered out all ghost referrers as they appear and thus should have virtually none of them appearing in my Google Analytics data. I have also checked the box that instructs Analytics to ignore "known bots".
So I'm wondering after all these measures, how many of my sessions each day should I still reasonably chalk up to bot traffic?
And a side question, is there anything else I can do to make my Analytics data more accurate in detecting only real human traffic?

One thing you could do is add an invisible link to your main page. Anyone who follows that link is very likely a bot.
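A minimal sketch of how the honeypot idea might be evaluated on the server side. It assumes you can get request records as (client IP, path) pairs and that the invisible link points at a path such as `/trap-link`. The path, the record format and the function names are all made up for illustration and have nothing to do with Google Analytics itself.

```python
from typing import Iterable, Set, Tuple

# Hypothetical path that only the invisible link points to.
HONEYPOT_PATH = "/trap-link"


def find_probable_bots(requests: Iterable[Tuple[str, str]]) -> Set[str]:
    """Return the client IPs that requested the honeypot path at least once."""
    return {ip for ip, path in requests if path == HONEYPOT_PATH}


def human_client_share(requests: Iterable[Tuple[str, str]]) -> float:
    """Share of distinct clients that never touched the honeypot path."""
    records = list(requests)
    clients = {ip for ip, _ in records}
    if not clients:
        return 0.0
    bots = find_probable_bots(records)
    return (len(clients) - len(bots)) / len(clients)
```

This only catches bots that actually follow links, so the result is better read as a lower bound on bot traffic than as an exact figure.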

Related

Technical problems when tracking Google Analytics

We've been tracking visitors to our site for over a year now, and when comparing last year to this year, site visitation, unique visitors, etc. have all been cut in half (roughly, not exactly half).
There isn't really a marketplace explanation for the decrease, and we're wondering if there are any technical problems we may have had to cause this to happen. We had another developer working on the site last year (who is no longer with us), and we're wondering if maybe the tracking code had been placed improperly. Our current developer looked back at the code during this time period and said that is not the issue.
Any other ideas as to why our analytics might be so off kilter?
Thanks,
Some of the reasons why this could happen are:
Some of the pages were removed.
Some tracking code was removed (although you say this is not the case).
There was some problem specific to your site (for example, a large number of international users).

Google Analytics - Visit duration 0 sec

I am using Google Web Analytics Online Tool to monitor visits on my site.
What bugs me is that I often see records containing the following entries:
Page Visits: 1.00
Average Visit Duration: 00:00:00
Bounce Rate: 100%
What does that mean?
If a visitor comes to my site, shouldn't they stay at least a couple of seconds before leaving?
Could that mean that something is wrong with accessing my site? (I had similar problems before, but I am convinced I fixed them, since I am not getting any errors when I try to access my site from different computers.)
When a visitor comes to your page, Google Analytics sets a cookie in which a timestamp is stored. When the user visits a second page on your site, Google compares the stored timestamp to the current time and calculates the visit duration from the difference between the two. If all your visitors have bounced, there is no second data point to compare the stored value to, and Google is unable to compute a duration.
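A rough sketch of that calculation, assuming a visit is just a list of hit timestamps. The function name and the data shape are illustrative and this is not GA's actual implementation.

```python
from datetime import datetime
from typing import List


def visit_duration_seconds(hit_times: List[datetime]) -> float:
    """Duration = time between the first and the last recorded hit of a visit.

    With a single hit (a bounce) there is nothing to subtract from,
    so the duration comes out as zero.
    """
    if len(hit_times) < 2:
        return 0.0
    ordered = sorted(hit_times)
    return (ordered[-1] - ordered[0]).total_seconds()
```

A visit with one pageview therefore reports 00:00:00 no matter how long the visitor actually stayed on the page.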
A common workaround is to set a JavaScript timeout and trigger an event after ten seconds or so (sent as an interaction event, i.e. with the non-interaction flag left at its default of false; see the Google Analytics event tracking docs for details). The assumption is that somebody who looks at your page for more than ten seconds is not actually a bounce. (I think that because "bounce rate" has such hugely negative connotations, people try to avoid high bounce rates even at the price of introducing bad data; you should realize that "bounce rate" simply means that there are not enough data points to say anything meaningful about those particular visitors.)
Personally I do not like that approach because it redefines a visitor's inaction as action. A better idea (IMO) is to implement a meaningful interaction point, such as a "read more" link that loads content via Ajax, and track that via event tracking or a virtual page view.
Event tracking guide:
https://developers.google.com/analytics/devguides/collection/gajs/eventTrackerGuide
Short update: With Universal Analytics the technical details have changed (i.e. there are no longer cookies with timestamps; all information is processed on the GA servers). So the first paragraph is no longer up to date, but the rest of the answer is still valid.
I'm having a similar issue. I monitor those placements and recently found out that the traffic is hardly reaching my site. A recent experiment showed that those are placements triggered via clicks from GDN, but the people never even reached my page; they were blocked by a pop-up blocker or similar software.

Possible Google Analytics Bug - Traffic Sources Total Visits not matching Total Visits in other reports

Has anyone else seen this issue?
As of roughly 2 weeks ago, I get conflicting figures for the Total Visits metric between the Traffic Sources report and the other reports (e.g. Visitors, Dashboard). For example, for the week of 5/9/2010 through 5/15/2010, the Dashboard and Visitors reports both say 386 Visits. The Traffic Sources report says 157 Visits, and the 4 main source types (Search, Direct, Referral, Other) sum to 157 Visits, not 386.
Any ideas? Is this a known bug, or could there be a configuration issue?
Thanks.
Well it seems that quite a few GA users have observed unaccounted-for behavior, particularly during the past couple of months.
For instance,
18 - 19 May 2010: more than 40 different GA users posted to the GA User Forum, all regarding the same issue: no data whatever was recorded in their GA Accounts during the 18th and 19th of May. No response from Google and nothing in the GA Blog. Several users who had other GA accounts that were functioning normally during this period suggested that the problem might be caused by recent changes by Google to the GATC (which was in fact recently revised); many of those who posted on the Forum said that indeed they had recently added the latest version of the GATC to their Sites/Pages.
6 - 9 May 2010: Over 50 GA users reported, by posts to the GA Forum, a complete GA outage during the period 6 - 9 May (no data appearing in their reports for at least one of those days). This time a GA Team member did respond with a one-line response "there was a delay in reporting, no data was lost." This post also referenced a Twitter message from GA stating the same thing.
In addition, I've seen a half dozen, perhaps more, recent posts (past 60 days) on the GA Forum in which users reported significant discrepancies between an aggregate figure and the sum of its constituents, with both sets of figures from the same Report, e.g.,
Numbers Don't add up on the Absolute Unique Visitor's Report
Search Engine drill-down visitors don't match total
Neither post was answered (either by the GA Team or anyone else).
Finally, since it's just a matter of clicking a menu and selecting a different option, I suggest comparing the figures you recited in your question with the analogous figures for Page Views, which is probably the simplest measurement in client-side analytics ("Visits" by contrast is strongly influenced by user cookie manipulation).
Through some trial-and-error looking at every specific source, I've traced the error to one item: within the Traffic Sources reports for the affected days (the issue seems to have partially righted itself as of yesterday's data, at least for my account), the delta/error/black hole was always equal to the Google CPC Search traffic for that day.
I have no idea what's causing the issue, but at least I know how to manually attribute the numbers. Hopefully Google has fixed this...
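For anyone who wants to repeat that check on exported numbers, here is a small sketch of the reconciliation. The function name and the dictionary layout are my own assumptions; the per-source figures would come from whatever you export from the Traffic Sources report.

```python
from typing import Dict, Tuple


def reconcile_visits(
    total_visits: int,
    visits_by_source: Dict[str, int],
    candidate_visits: int,
) -> Tuple[int, bool]:
    """Return the unattributed gap and whether the candidate segment explains it.

    total_visits:      the figure from the Dashboard / Visitors report
    visits_by_source:  visits per source type from the Traffic Sources report
    candidate_visits:  visits of the segment suspected to be missing
                       (Google CPC search traffic in the case described above)
    """
    gap = total_visits - sum(visits_by_source.values())
    return gap, gap == candidate_visits
```

With the figures from the question (386 visits in total, 157 across the four source types) the gap is 229 visits, so the test is whether Google CPC traffic for the same period also comes to 229.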
Thank you to all who commented/answered. I appreciate it.

When is Google Analytics not good enough? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
I'm trying to determine why an enterprise wouldn't want to use Google Analytics.
Here are the main reasons I've seen mentioned:
Inability to track clients that have Javascript disabled.
Lack of ownership of the statistics - Google owns the data.
Most of the web clients with Javascript disabled will probably be bots/spiders. This data is interesting, but probably not very useful.
As for the ownership issue, this is a bit paranoid IMO.
What am I missing here? When is Google Analytics not good enough?
Here are my findings from additional research:
Google Analytics is limited to 5 million page views per month - source
If a web site generates more than 5 million pageviews per month, it will need to be linked to an active AdWords account to avoid interruption of service.
Lack of / slow technical support
All Google support is handled through email and response times can take a week or more. Commercial analytics products often have much faster & personalized support.
Inability to track files (PDF's, Images, etc.)
GA relies on Javascript and files lack the ability to execute Javascript. The workaround to this problem is to tag the link, but this won't track requests that go directly to the file.
Limited ability to customize
This is a selling point that I see pushed by commercial analytics tools (WebTrends). However it's never explained what customizations are denied by GA but allowed by WebTrends.
The Google Analytics EULA does not allow you to track individual users by identifying them. So if you wanted to add a custom variable for username to track how many times each user logs in, then you would be in a gray zone if not outright violating the EULA.
I use Google Analytics on about 10 sites right now and it's a great tool. In addition to all the analytics stats, you can tie it in with AdSense and it becomes a marketing/revenue tool and not just "wow look at all these cool user stats". If there were a way to track by user ID in certain circumstances (e.g. if users agreed to it, or if they work for the company that owns the site), then I would have no issues.
Besides, it's free and all you have to do is add JavaScript to the files, so give it a try and see what you think after a few months.
One reason that was, surprisingly, not posted:
timing / speed of reaction
It takes at least 4 hours (up to 24) for GA to update your data.
This is ok for me personally in most of the cases, but when reacting fast is crucial (news sites, one-off events, etc.) you may want to employ some other solution (Mint comes to mind, but it's not the only one out there of course).
Thought I'd add my two pence worth to this thread, as this is a topic close to my heart and one I've debated with colleagues for years. We've used WebTrends in house for as long as I can remember, back to version 4 of the log analyzer (how different things were back then!). Since Google Analytics came along, we've started to come under increasing pressure from certain parts of our business to switch, as 'it does everything we need from an analytics tool'.
Well, true, in many senses it does, especially these days. But I championed the integration of our CRM and web analytics tools back in 2006, and as our business isn't e-commerce (the 'conversion' happens offline, sometimes months after the visitor acquisition), we need to integrate in this way to get a true picture of campaign effectiveness and a notion of ROI.
All of this means we need access to the raw data and need to be able to join visitor records on sessionID etc. Without this access we'd be screwed. I'd love it if we could roll without it, but the current requirements mean we can't, so this alone is a HUGE reason why Google Analytics is not good enough.
Over and out
For tracking desktop software or creating a white-label solution, there are better options.
For white-label and integration-based analytics, I use MixPanel. For desktop software, I use Deskmetrics.
Google Analytics does not work well with mobile phones. While the iPhone and the Palm may be supported, many of the existing handsets do not support the javascript that Google uses.
If you're based in the UK, then theoretically you could be breaking the Data Protection Act by using Analytics.
If information about your users (like which web pages they're looking at) goes "outside the European Economic Area" and onto Google's servers in the US, then you're breaking the DPA.
Pretty obscure, but you did ask :)
Piwik avoids the problem because you host it on your own servers.
Lack of ownership of the statistics - Google owns the data.
... As for the ownership issue, this is a bit paranoid IMO.
One problem with it is that we can't even access the raw data. We had a use case this week where we wanted a visitor map for an executive presentation. We needed to get more flexible with how the visitor map is displayed (wanted to view the map in Google Earth plug-in). In GA, you can't. You take what they give you. You can see a map of how many visits came from each city, but you can't export a data file of cities and number of visits, to run the data through other tools. So, paranoia aside, there are significant limitations on what you can accomplish with GA.
However this is not a problem if you use Urchin, the self-hosted version of GA: you can export the data and do what you want with it. (And the exported data is richer than the web server log's, as it includes some analysis already.)
Since Piwik is open source, and pluggable, I imagine you could enhance the visitor map plug-in any way you wanted to. And export whatever data you want.
Whether this limitation affects you depends on your needs, obviously.
Update: I've now looked at the GA Data Export API, and it turns out that things you cannot do through the UI (as you can with Urchin), you can do with this API. It does look like you can export the visit data I was talking about, via a feed (although there are daily traffic caps on those requests). So sprinkle salt heavily on what I wrote above.
A couple more points that I've come across:
GA doesn't let you dig beyond full-day statistics; I would often like the ability to investigate whether a traffic dip the previous day was caused by the design update I did at 1pm or the soccer match on TV at 8pm.
GA doesn't offer a workaround for traffic spikes caused by DDoS attacks, Slashdotting etc. When I'm looking at a GA visitor graph of 2009, all I can see is the 2-million-pageview-spike on October 16th, pushing the entire rest of the year down flat against the horizontal axis of the graph. To get a meaningful graph, GA should offer the ability to trim or exclude outlying data points, or the ability to limit/bracket the graph window itself
GA doesn't have an event monitoring client (think Reinvigorate's Snoop tool)
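Until the tool offers something like that, the outlier trimming from the second point can be done on exported daily figures. A minimal sketch, assuming you already have one pageview count per day as a plain list; the function name and the default cap of ten times the median are arbitrary choices of mine.

```python
from statistics import median
from typing import List


def clip_spikes(daily_pageviews: List[int], max_multiple: float = 10.0) -> List[float]:
    """Cap every day's value at max_multiple times the median day.

    Ordinary days are left untouched; a one-off spike is flattened
    enough that the rest of the series stays readable when plotted.
    """
    if not daily_pageviews:
        return []
    cap = median(daily_pageviews) * max_multiple
    return [min(float(value), cap) for value in daily_pageviews]
```

Capping keeps the spike visible as a plateau. Dropping the day entirely would be the alternative if you would rather exclude it from the graph.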
While GA is very user-friendly, I've found it's not as granular as some of the other stats programs (or maybe I'm not looking in the right places). Before the marketing monkeys I work with began pushing GA, we were very satisfied with AWStats. The sheer scope of the data helped us on several occasions hone sites to better suit their audience. While GA is very shiny and laid out well, I personally still prefer the raw numbers like I used to get through AWStats.
Slow data processing speed - Can be as low as 15-30 mins for page views, but may be up to 48 hours for eCommerce
EULA is limiting in some cases
You won't own or have any control of the data. Google's engineers might use it (anonymously) for testing
Anything more complex requires customization - Downloads and such are no issue, but there are limits
Cross domain tracking by linker is faulty at best
Visit based - Proper tools are based on Visitor level, GA works on Visit based reporting mostly
Limited number of custom vars used at one time (5)
No tech support, if you're realistic
Usually when there is a downtime notice, it's already gone
API limitations (4 dimensions and 10 metrics at one time, not all can be used together in addition to that)
I have many more, but at the end of the day it is a good tool for its price.
From a non-technical point of view, I think the most important reason is that some enterprises have a strict data security policy: all of the data must be controlled and managed by the enterprise itself.
If you use Google Analytics, the data is stored on Google's servers. For certain kinds of enterprise, such as insurance or financial companies, that policy has to be followed.
I would NOT go with server logs. In fact I have them disabled on my server. Why, you ask?
For the simple reason that every time you hit my server, that stupid logging program makes an entry in the physical log file on my HDD. So if my server gets 100,000 hits in a day, that's 100,000 HDD write operations.
You think that's cool? Well, it's not. It's slowing your server down, especially if the log file is huge.
Why would someone even consider doing that to their server? Especially when we're working so hard to minify JavaScript and CSS and make image files 2 KB smaller!
Please do yourself a favor: don't log directly on your server.
At least Google Analytics logs it on Google's server so my server's healthier.
I wouldn't use it for any of my sites, because you're forcing the user to accept your proprietary JavaScript code in their browser, which is bad. Also, giving your data to Google is a really bad idea.
See Piwik for something you can run yourself as free software, eliminating both of these problems.

Inundated with marketing tracking pixels (Campaigns with multiple vendors)!

We have some third parties that are sending us traffic and have asked us to put a tracking pixel on the confirmation page so they can track through the sales.
We are currently using Google analytics for our own usage.
Google will remember the original referral through cookies. This may be a good or bad thing. If someone purchases through company B's link but they had originally found our site through company A, then company A still gets the 'referral'. That doesn't seem fair, but it seems to be the way Google Analytics works:
For example, if this is the user's first visit to your site, the tracking code will add the campaign tracking information to the cookie. If the user previously found and visited your site, the tracking code increments the session counter in the cookie. Regardless of how many sessions or how much time has passed, Google Analytics "remembers" the original referral. This gives Analytics true multi-session tracking capability.
Currently we only have one tracking pixel on our 'receipt page', from a company that we're not even doing business with. Having a second company ask us to add one makes me think 'wait a minute - we're going to suddenly be inundated with these things!'. Plus it means someone can look at the source and see all the people we do business with.
This isn't Oprah - you can't ALL have tracking pixels. Right?
How should we manage sales from multiple traffic sources in the most honest way for both sides - especially if they already have a system set up that they insist on using?
Here's how I solved the problem at our company: we gave our partners a URL that has a parameter in the query string. This parameter triggers a cookie. On the "goal"/confirmation page (where the tracking pixel is usually inserted), we insert some logic to see if the cookie value is correlated with one of our recognized partners (chained if-else or switch statement). If a match is found, then the tracking pixel is displayed.
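Roughly, the logic looks like the sketch below. The partner IDs, pixel URLs, cookie name and query parameter name are all placeholders for illustration, and a dictionary lookup stands in for the chained if-else or switch statement.

```python
from typing import Mapping, Optional

# Hypothetical partner IDs mapped to the pixel URL each partner supplied.
PARTNER_PIXELS = {
    "partner_a": "https://tracking.partner-a.example/pixel.gif",
    "partner_b": "https://tracking.partner-b.example/pixel.gif",
}

COOKIE_NAME = "ref_partner"  # hypothetical cookie name
QUERY_PARAM = "partner"      # hypothetical query-string parameter


def cookie_value_for_landing(query: Mapping[str, str]) -> Optional[str]:
    """Landing page: return the cookie value to set, or None for unknown partners."""
    partner = query.get(QUERY_PARAM)
    return partner if partner in PARTNER_PIXELS else None


def pixel_for_confirmation(cookies: Mapping[str, str]) -> Optional[str]:
    """Confirmation page: return the one pixel URL to render, or None."""
    return PARTNER_PIXELS.get(cookies.get(COOKIE_NAME, ""))
```

Only the pixel of the partner stored in the cookie is ever rendered, so the page source no longer lists every company you work with. Whether a later partner link should overwrite an earlier cookie (last click wins) or not (first click wins) is a business decision this sketch leaves open.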
Even though you asked this question a while ago, I hope that this still helps you or someone else with the same problem!
