Google Analytics - Absolute Unique Visitors query

We've recently installed Google Analytics on our intranet site which - obviously - is only available to internal staff. There are around 4000 members of staff in our place of work and so we can have an idea of the upper ranges of our "unique visitor" count.
For the period Mar 15, 2011 - Apr 14, 2011, there have been "10,307 Absolute Unique Visitors", averaging out at around 2000 - 2500 Absolute Unique Visitors hitting the site each day.
Is this metric telling me what I think it is? That is, in the period stated, 10,307 different people visited our intranet site. If so, how can this be, when we only have around 4000 staff?
Any help much appreciated.
Thanks

Unique visitors are tracked using cookies.
If each member of your staff uses two or three browsers or devices (e.g., Firefox and IE, or a laptop, a desktop, and a smartphone), you'll get 10K unique visitors.
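As a rough sanity check (my own arithmetic, assuming most of the 4,000 staff visited at least once during the month): 4,000 staff × roughly 2.5 cookie-bearing browsers/devices each ≈ 10,000 distinct visitor cookies, which is right in line with the 10,307 reported.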

It could be that your organisation's infrastructure forces cookies to be cleared after every session, or when users log out of their machines.
When cookies are cleared, a subsequent visit will be counted as a new visitor.
I would suggest taking a look at a user's repeat visits and ensuring that the __utmz cookie does indeed persist.
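One quick way to verify this (my own suggestion): note the visitor ID embedded in the __utma cookie during one visit, have the user log out and back in, and confirm the same value is still there on the next visit; if it has changed, the cookies are being wiped and every session will show up as a new visitor.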

Related

A good estimate as to how many visitors are bots

My blog is about 7 months old. At my current level I usually get around 100 sessions per day. I have always actively filtered out all ghost referrers as they appear and thus should have virtually none of them appearing in my Google Analytics data. I have also checked the box that instructs Analytics to ignore "known bots".
So I'm wondering after all these measures, how many of my sessions each day should I still reasonably chalk up to bot traffic?
And a side question, is there anything else I can do to make my Analytics data more accurate in detecting only real human traffic?
One thing you could do is add an invisible link to your main page; anyone clicking on that link has a very high probability of being a bot.
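A minimal sketch of that idea, assuming a Flask app (the /trap URL and the log file name are placeholders of my own, not anything Analytics-specific):

```python
from flask import Flask, request

app = Flask(__name__)

@app.route("/")
def index():
    # The link is hidden from human visitors via CSS, but most bots that
    # blindly follow hrefs will still request it.
    return (
        "<html><body>"
        "<h1>Welcome</h1>"
        '<a href="/trap" style="display:none" rel="nofollow">ignore me</a>'
        "</body></html>"
    )

@app.route("/trap")
def trap():
    # Anything requesting this URL is very likely a bot; log the IP and
    # user agent so those sessions can be discounted later.
    with open("likely_bots.log", "a") as f:
        f.write(f"{request.remote_addr}\t{request.headers.get('User-Agent', '')}\n")
    return "", 204

if __name__ == "__main__":
    app.run()
```

You can then exclude the logged IPs/user agents from your Analytics view, or at least use the count as a rough floor for how much bot traffic still slips through.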

Technical problems when tracking Google Analytics

We've been tracking visitors to our site for over a year now, and when comparing last year to this year, site visitation, unique visitors, etc. have all been cut in half (roughly, not exactly half).
There isn't really a marketplace explanation for the decrease, and we're wondering if there are any technical problems we may have had to cause this to happen. We had another developer working on the site last year (who is no longer with us), and we're wondering if maybe the tracking code had been placed improperly. Our current developer looked back at the code during this time period and said that is not the issue.
Any other ideas as to why our analytics might be so off kilter?
Thanks,
Some of the reasons why this could happen are:
Some of the pages were removed.
Some code was removed (you say this is not the case, but see the sketch below for a quick way to double-check).
There was some problem specific to your site (like a large number of international users, etc.).
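For the second point, a quick spot-check along these lines (my own sketch; the URLs and the markers searched for are assumptions based on the classic ga.js snippet) can confirm the tracking code is still present on a sample of pages:

```python
import urllib.request

# Pages to spot-check; replace with a representative sample of your own URLs.
PAGES = [
    "https://www.example.com/",
    "https://www.example.com/products",
    "https://www.example.com/contact",
]

# Strings that should appear if the classic GA tracking snippet is installed.
MARKERS = ("google-analytics.com/ga.js", "_gaq.push", "UA-")

for url in PAGES:
    try:
        html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", "replace")
    except Exception as exc:
        print(f"{url}\tERROR: {exc}")
        continue
    status = "tracking code found" if any(m in html for m in MARKERS) else "NO TRACKING CODE"
    print(f"{url}\t{status}")
```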

Google Analytics and Piwik Discrepancy

Hi guys, I was wondering if anyone has the same problem as I do. I have two trackers, Google Analytics and Piwik, but after some time I found out there is a discrepancy. Please read below for more information.
Here is the data for yesterday (running the then-new Piwik v1.7.1).
GA: 14,803 visits (unique visitors)
Piwik: 10,254 visits (unique visitors)
31% discrepancy.
Question
What do I have to do to make the records match? Or which of the statistics is the correct one?
Any advice would be much appreciated.
Relative to their own definitions, they are both correct. The difference comes down to HOW each one calculates what a unique visitor is. No two stats aggregators work the same way.
From the Google Analytics help article "What's the difference between the 'Absolute Unique Visitor' report and the 'New vs. Returning' report?":
Absolute Unique Visitors
In this report, the question asked is: 'has this visitor visited the website prior to the active (selected) date range?' The answer is a simple yes or no. If the answer is 'yes,' the visitor is categorized under 'Prior Visitors' in our calculations; if it is no, the visitor is categorized under 'First Time Visitors.' Therefore, in your report, visitors who have returned are still only counted once.
From the Piwik FAQs:
How is a 'unique visitor' counted in Piwik?
Unique Visitors is the number of visitors coming to your website; Unique Visitors are determined using first party cookies.
If the visitor doesn't accept cookie (disabled, blocked or deleted cookies), a simple heuristic is used to try to match the visitor to a previous visitor with the same features (IP, resolution, browser, plugins, OS, ...).
Note that by default, Unique Visitors are available for days, weeks and months periods, but Unique Visitors is not processed for the "Year" period for performance reasons. See how to enable Unique Visitors for all date ranges.
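To make the contrast concrete, here is a minimal sketch of the kind of cookieless heuristic Piwik's FAQ describes (illustrative only, not Piwik's actual code):

```python
import hashlib

def visitor_fingerprint(ip, user_agent, resolution, plugins):
    # Collapse a handful of request attributes into one rough "visitor" key.
    raw = "|".join([ip, user_agent, resolution, ",".join(sorted(plugins))])
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()

# Two cookieless hits with the same attributes collapse to one visitor...
a = visitor_fingerprint("10.0.0.5", "Mozilla/5.0 (Windows NT 6.1)", "1920x1080", ["flash", "pdf"])
b = visitor_fingerprint("10.0.0.5", "Mozilla/5.0 (Windows NT 6.1)", "1920x1080", ["pdf", "flash"])
print(a == b)  # True

# ...whereas a purely cookie-based tracker counts a cleared-cookie visitor
# as a brand-new one. Different fallbacks mean different totals.
```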
They both use cookies to determine uniques, but they go about calculating them in different ways. It's apples and oranges when comparing stats packages side by side.
Examine the rest of the stats beyond unique visitors. If there is a wide margin across the board, take a close look at the implementation of both.
If all is well with both implementations, then pick one and go with it for the stats. Overall trends are what you are looking for. Are the stats you want to go up going up? Are the stats you want to go down going down?
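As a rough illustration of watching trends rather than absolute numbers (only yesterday's figures come from the question; the rest are made up):

```python
# Daily unique-visitor counts exported from each tool (illustrative values).
ga    = [14803, 15102, 14655, 15890, 16012]
piwik = [10254, 10477, 10011, 11120, 11302]

for i in range(1, len(ga)):
    ga_dir = "up" if ga[i] > ga[i - 1] else "down"
    pw_dir = "up" if piwik[i] > piwik[i - 1] else "down"
    gap = (ga[i] - piwik[i]) / ga[i] * 100
    print(f"day {i}: GA {ga_dir}, Piwik {pw_dir}, gap {gap:.0f}%")

# If both tools move in the same direction day to day, the absolute gap
# matters much less than it looks.
```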

Possible Google Analytics Bug - Traffic Sources Total Visits not matching Total Visits in other reports

Has anyone else seen this issue?
As of roughly 2 weeks ago, I get conflicting figures for the Total Visits metric between the Traffic Sources report and the other reports (e.g. Visitors, Dashboard). For example, for the week of 5/9/2010 through 5/15/2010, the Dashboard and Visitors reports both say 386 Visits. The Traffic Sources report says 157 Visits, and the 4 main source types (Search, Direct, Referral, Other) sum to 157 Visits, not 386.
Any ideas? Is this a known bug, or could there be a configuration issue?
Thanks.
Well it seems that quite a few GA users have observed unaccounted-for behavior, particularly during the past couple of months.
For instance:
18 - 19 May 2010: more than 40 different GA users posted to the GA User Forum, all regarding the same issue: no data whatever was recorded in their GA accounts during the 18th and 19th of May. No response from Google and nothing in the GA Blog. Several users who had other GA accounts that were functioning normally during this period suggested that the problem might be caused by recent changes by Google to the GATC (which was in fact recently revised); many of those who posted on the Forum said that indeed they had recently added the latest version of the GATC to their sites/pages.
6 - 9 May 2010: over 50 GA users reported, by posts to the GA Forum, a complete GA outage during the period 6 - 9 May (no data appearing in their reports for at least one of those days). This time a GA Team member did respond with a one-line reply: "there was a delay in reporting, no data was lost." That post also referenced a Twitter message from GA stating the same thing.
In addition, I've seen half a dozen, perhaps more, recent posts (past 60 days) on the GA Forum in which users reported significant discrepancies between an aggregate figure and the sum of its constituents, both sets of figures coming from the same report, e.g.:
Numbers Don't add up on the Absolute Unique Visitor's Report
Search Engine drill-down visitors don't match total
Neither post was answered (either by the GA Team or anyone else).
Finally, since it's just a matter of clicking a menu and selecting a different option, I suggest comparing the figures you cited in your question with the analogous figures for Page Views, which is probably the simplest measurement in client-side analytics ("Visits", by contrast, is strongly influenced by user cookie manipulation).
Through some trial-and-error looking at every specific source, I've traced the error to one item: within the Traffic Sources reports for the affected days (the issue seems to have partially righted itself as of yesterday's data, at least for my account), the delta/error/black hole was always equal to the Google CPC Search traffic for that day.
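In code form, the check amounts to something like this (the Dashboard and Traffic Sources totals are the figures from the question; the CPC figure is a hypothetical value for the same day):

```python
dashboard_visits = 386        # Visits per the Dashboard / Visitors reports
traffic_sources_total = 157   # Sum of the Search + Direct + Referral + Other rows
google_cpc_visits = 229       # hypothetical google / cpc visits for that day

delta = dashboard_visits - traffic_sources_total
print(delta)                       # 229
print(delta == google_cpc_visits)  # True -> the "black hole" equals the CPC traffic
```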
I have no idea what's causing the issue, but at least I know how to manually attribute the numbers. Hopefully Google has fixed this...
Thank you to all who commented/answered. I appreciate it.

Basic site analytics doesn't tally with Google data

After being stumped by an earlier question: SO google-analytics-domain-data-without-filtering
I've been experimenting with a very basic analytics system of my own.
MySQL table:
hit_id, subsite_id, timestamp, ip, url
The subsite_id lets me drill down to a folder (as explained in the previous question).
I can now get the following metrics:
Page Views - Grouped by subsite_id and date
Unique Page Views - grouped by subsite_id, date, url, IP (not necessarily how Google does it! See the sketch after this list.)
The usual "most visited page", "likely time to visit" etc etc.
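For reference, the unique page views figure comes from a query along these lines (the table name hits is just what I'm calling it here; "unique" means distinct IPs per URL per day, which is not necessarily Google's definition):

```python
# Run against the hits table described above via your usual MySQL client/driver.
UNIQUE_PAGE_VIEWS_SQL = """
SELECT subsite_id,
       DATE(timestamp)    AS day,
       url,
       COUNT(DISTINCT ip) AS unique_page_views
FROM   hits
GROUP  BY subsite_id, day, url
ORDER  BY day, unique_page_views DESC
"""
```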
I've now compared my data to that in Google Analytics and found that Google has lower values for each metric, i.e. my own setup is counting more hits than Google.
So I've started discounting IPs from various web crawlers: Google, Yahoo & Dotbot so far.
Short Questions:
Is it worth me collating a list of all major crawlers to discount, and is any such list likely to change regularly?
Are there any other obvious filters that Google will be applying to GA data?
What other data would you collect that might be of use further down the line?
What variables does Google use to work out entrance search keywords for a site?
The data is only going to be used internally for our own "subsite ranking system", but I would like to show my users some basic data (page views, most popular pages, etc.) for their reference.
Lots of people block Google Analytics for privacy reasons.
Under-reporting by the client-side rig versus server-side seems to be the usual outcome of these comparisons.
Here's how I've tried to reconcile the disparity when I've come across these studies:
Data Sources recorded in server-side collection but not client-side:
hits from mobile devices that don't support JavaScript (this is probably a significant source of disparity between the two collection techniques; e.g., a Jan 07 comScore study showed that 19% of UK Internet users access the Internet from a mobile device)
hits from spiders and bots (which you mentioned already)
Data Sources/Events that server-side collection tends to record with greater fidelity (far fewer false negatives) compared with JavaScript page tags:
hits from users behind firewalls, particularly corporate firewalls; firewalls can block the page tag, and some are configured to reject/delete cookies.
hits from users who have disabled JavaScript in their browsers (about five percent, according to the W3C data).
hits from users who exit the page before it loads. Again, this is a larger source of disparity than you might think. The most frequently-cited study to support this was conducted by Stone Temple Consulting, which showed that the difference in unique visitor traffic between two identical sites configured with the same web analytics system, but which differed only in that the JS tracking code was placed at the bottom of the pages on one site and at the top of the pages on the other, was 4.3%.
FWIW, here's the scheme I use to remove/identify spiders, bots, etc. (a rough code sketch follows the list):
monitor requests for our robots.txt file, then filter all other requests from the same IP address + user agent (not all spiders will request robots.txt of course, but with minuscule error, any request for this resource is probably a bot).
compare user agents and IP addresses against published lists: iab.net and user-agents.org publish the two lists that seem to be the most widely used for this purpose.
pattern analysis: nothing sophisticated here; we look at (i) page views as a function of time (i.e., clicking a lot of links with 200 msec on each page is probative); (ii) the path by which the 'user' traverses our site, and whether it is systematic and complete or nearly so (like following a back-tracking algorithm); and (iii) precisely-timed visits (e.g., 3 am each day).
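A rough sketch of the first step (my own; it assumes the common Apache/nginx "combined" access-log format and a file called access.log):

```python
import re

# Matches the standard "combined" log format: ip, timestamp, request, status,
# size, referrer, user agent.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'\d+ \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

bot_keys = set()   # (ip, user agent) pairs that requested robots.txt
hits = []          # every parsed hit, so we can filter after one pass

with open("access.log") as f:
    for line in f:
        m = LOG_RE.match(line)
        if not m:
            continue
        key = (m.group("ip"), m.group("ua"))
        if m.group("path").split("?")[0].endswith("robots.txt"):
            bot_keys.add(key)
        hits.append((key, m.group("path")))

human_hits = [(key, path) for key, path in hits if key not in bot_keys]
print(f"kept {len(human_hits)} of {len(hits)} hits after dropping robots.txt requesters")
```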
The biggest reasons are that users have to have JavaScript enabled and have to load the entire page, since the tracking code is often in the footer. AWStats and other server-side solutions like yours will capture everything. Plus, Analytics does a really good job of identifying bots and scrapers.
