At work we recently made some changes to our URL structure (permanent redirects to new URLs), redirecting various routing rules to a consistent format/page. This effectively changed 90% of our URL structure.
In Google Analytics, we've seen a bump in "visitors" of nearly 30%, but our bounce rate has seen a similar spike. Where is this coming from?
Upon inspecting further, when drilling into the network and location of the visitors from "Mozilla Compatible Agent" in Google Analytics, the majority of the traffic was coming from two locations (Hialeah, Florida and San Francisco, California), both stemming from Microsoft's network. This traffic was present all along, but made up under 1% of our overall traffic until the recent change.
I can only posit that it has something to do with Bing (likely Bing's Preview functionality), which is actually executing each page in browser instances and capturing the result.
I think the surprising thing is that Google isn't already filtering these hits out, but given that most crawlers/spiders don't execute JavaScript, the functionality may simply not be present.
The caveat here is to take this into account when tracking your stats. We now have a filter against traffic coming from "microsoft" as the network source so that this doesn't affect our stats so strongly in the future.
Related
Recently I've been experiencing a large amount of (what I think is) ghost traffic.
I need help in creating a filter to exclude this traffic from my Google Analytics.
URLs are showing up that have other websites' addresses appended to them.
Almost all articles I've read mention including only relevant hostnames, but this doesn't seem to apply to my situation.
Here you can see the URLs with other random website addresses appended (overworlf.com, evite.com, shmoop.com and many others).
Here is a screenshot of the hostnames; none of them are out of the ordinary. Judging by the huge number of users, I suspect this ghost traffic is using my main domain.
I posted the same question on Stack Exchange, and someone there was able to help me:
https://webmasters.stackexchange.com/a/118666/94264
"Almost all the analytics spammers insert data into your stats by pinging the GA tracker directly with fake data. They never visit your site and they usually just guess at the tracking id without knowing website host name associated with it. They won't send a host name, so it wouldn't appear in that report. See How to fight off Google Analytics referrer spammers?
That appears not be the case here. In this case these appear to be actual hits to your website. I tried one of those "top active pages" and it gives a 404 error. It looks like your 404 template has the GA tricking snippet installed on it. I don't think that is best practice. You could try taking the snippet off your 404 page. Then if you did get actual hits to such URLs, GA wouldn't count them as pages."
This can happen when there are search and replace or advanced filters. Are there filters on your view that alter the Request URI?
EDITED AFTER IT WAS CONFIRMED THAT THERE WERE NO FILTERS:
Typically, tracking 404 pages is best practice (referring to your other post).
I don't believe that removing the tracking from that page will help anyway. Like the other poster mentioned, these hits are sent from bots most of the time, and they never actually land on your site. The hit is sent directly to your property with an HTTP call. It bypasses the site completely, so whether there is a 404 page or not, the hit will show up in GA.
Try adding an exclusion filter to exclude traffic with a page path (not hostname) ending in ".com" (see the sketch below).
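For reference, a sketch of the kind of pattern such a filter could use (the filter itself is configured in the GA admin UI; the regex is an assumption about what your spam paths look like):

// Hypothetical regex for a GA "Exclude" filter on Request URI
// (not Hostname): matches page paths ending in ".com".
var spamPath = /\.com\/?$/;
console.log(spamPath.test('/evite.com'));   // true  -> excluded
console.log(spamPath.test('/blog/post-1')); // false -> kept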
I'm very new to online analytics. I just deployed a site a few days ago, told no one, and Google Analytics is saying I have hundreds of users and sessions all over the world.
Even if events are being logged from my own development, there shouldn't be so many users (and so many sessions... I'm not developing THAT vigorously).
Also, my server logs indicate the level of activity I expect: ~0. So it's not like I'm magically getting traffic somehow. It really is nonexistent.
What could be going on? I can understand seeing a few sessions here and there, for web crawlers, but I don't understand why the numbers are so high.
Any common gotchas?
I realize this is a vague question, but I'm not sure what other information to provide, so please let me know what I can do to help.
Traffic source
First, check whether the traffic actually comes through your website (i.e. through the analytics.js library). To do this, remove analytics.js for a while and check whether traffic still shows up in Google Analytics (e.g. in the Realtime report).
If it is still coming in, somebody may be using the Measurement Protocol to spam your property.
To prevent this, add a custom parameter to your calls and create a filtered view that only includes hits carrying it, as sketched below. Anything without this parameter gets thrown away.
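A minimal sketch of the idea, assuming analytics.js and a free custom dimension slot (the dimension index and token value here are made up):

// Send a secret token with every hit; spammers pinging the
// Measurement Protocol blindly won't know to include it.
ga('create', 'UA-XXXXX-Y', 'auto');
ga('set', 'dimension1', 'my-secret-token'); // hypothetical slot and value
ga('send', 'pageview');
// In the GA admin, create a view with an include filter on dimension1
// equal to 'my-secret-token'; everything else is dropped from that view.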
Check sessions and returning visitors
Check whether the traffic is random (usually one pageview per session) or whether the behavior of the users looks normal.
Custom client ID
Check that you are not playing with the client ID in your analytics.js configuration, i.e. that you don't have a random number generator in there (see the sketch below).
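For illustration, this is the kind of misconfiguration to look for (a sketch, not something to copy):

// Problematic: generating a new random clientId on every page load
// makes every hit look like a brand-new user.
ga('create', 'UA-XXXXX-Y', {
  clientId: String(Math.random()) // inflates user and session counts
});

// Normal setup: let analytics.js manage the clientId itself.
ga('create', 'UA-XXXXX-Y', 'auto');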
Check traffic sources (referrals) and browsers
See whether one of them is dominant, or whether there is some pattern in the versions (absolute randomness is a pattern too).
Preventing random access through the website
For every visitor who is on your page for the first time, set a cookie with the current timestamp. If the cookie is not older than, say, an hour or a day, do not track this user. Or buffer hits and fire them later, once you have proven the user is real. A sketch of the cookie approach follows.
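A minimal sketch of that idea, assuming analytics.js (the cookie name and the one-hour threshold are arbitrary):

// Only fire the pageview once the visitor's first-seen cookie is
// older than an hour; brand-new (possibly fake) visitors are skipped.
function getCookie(name) {
  var m = document.cookie.match('(?:^|; )' + name + '=([^;]*)');
  return m ? m[1] : null;
}

var firstSeen = getCookie('first_seen');
if (!firstSeen) {
  // First visit: remember the timestamp, don't track yet.
  document.cookie = 'first_seen=' + Date.now() + '; path=/; max-age=31536000';
} else if (Date.now() - Number(firstSeen) > 60 * 60 * 1000) {
  ga('send', 'pageview'); // visitor has persisted for an hour; likely real
}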
Anyway, if you come back with new hints or information from your analysis, we can help you better. Right now this is like reading a crystal ball :-)
How (im)precise is Google Analytics actually? I've been using Google Analytics for years now on a pretty well visited web site (800k+ visits per month).
Now I decided to log every page request in a database table, and I'm tracking the user-agent of the request. I have also eliminated bot requests (googlebot, bingbot and many more...)
What I found out is that some pages receive nearly double the requests that Google Analytics is willing to admit as pageviews.
E.g. GA shows 137 pageviews for a specific URL, but I tracked 255!
Google Analytics is VERY precise. It's not very accurate, though. And that's the difference you're seeing, since you're looking at absolute numbers instead of trends.
Start by reading this post by Avinash:
Reflections: Accuracy, Precision & Predictive Analytics
Bots are everywhere these days, and a lot of the time they are disguised as real user agents. You should come up with a test to make sure the client has both JavaScript and cookies enabled; only in that case will it be tracked by Google Analytics (a sketch follows).
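One way to run such a test is with a tiny beacon of your own (a sketch; the /probe.gif endpoint is made up):

// If this runs at all, JavaScript is enabled; the cookie round-trip
// tells you whether cookies are accepted too.
document.cookie = 'cookie_probe=1; path=/';
var cookiesOk = document.cookie.indexOf('cookie_probe=1') !== -1;
new Image().src = '/probe.gif?cookies=' + (cookiesOk ? 1 : 0);
// Compare probe.gif hits against raw page requests in your server logs.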
Besides that, some users might have ad-block extensions that block Google Analytics. This is fairly uncommon, but depending on the audience it can be more common: tech-savvy users have a higher chance of using a plugin that blocks GA, so IT blogs might be hit harder than an average site.
The best way to test the real accuracy of Google Analytics, setting aside user agents without JavaScript or cookies and those that block GA tracking, is to track the users on your site using GA's own collection mechanism as well. You can do that in Google Analytics (legacy ga.js) using _setLocalRemoteServerMode.
Add the following line at the end of your GATC (GA Tracking Code):
_gaq.push(['_setLocalGifPath', 'http://mysite.com/__utm.gif']);
_gaq.push(['_setLocalRemoteServerMode']);
Make sure to replace http://mysite.com/__utm.gif with a path on the same domain as your website that responds with a GIF. Use a lightweight GIF, like the one GA uses.
Then you can take the access logs for this GIF and read the visited URLs out of its request parameters. You'll need to do some extra processing (a sketch follows the link below), but you'll be using the same framework GA uses to collect data and can thus measure GA's precision much more directly.
More Info:
https://developers.google.com/analytics/devguides/collection/gajs/methods/gaJSApiUrchin#_gat.GA_Tracker_._setLocalGifPath
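As a rough sketch of that post-processing step (Node.js assumed; the log file name and format are assumptions, but the utmp parameter of each __utm.gif request does carry the page path):

// Pull the visited page paths out of the local __utm.gif access log.
const fs = require('fs');

for (const line of fs.readFileSync('access.log', 'utf8').split('\n')) {
  const hit = line.match(/__utm\.gif\?([^" ]*)/);
  if (!hit) continue;
  const params = new URLSearchParams(hit[1]);
  console.log(params.get('utmp')); // page path reported to GA
}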
I'm using both Facebook Insights and Google Analytics on a site and getting very different numbers from each. Why is this?
The discrepancy is also mentioned in a Quora answer (Which is better, Facebook Insights or Google Analytics?)
Footnote: if you decide to use both, do not report them side-by-side, and never expect them to match. Trying to explain the differences will drive you mad.
Could someone explain?
This problem is quite common, and it is very hard to explain to clients why the numbers do not reconcile among different analytics platforms.
Firstly, I believe that because both rely on remote connections to Google or Facebook, some user sessions will get lost (what happens when the user hits Stop in the browser before the .js downloads, for instance?).
Secondly, I believe ad-blocking software may stop the file from being downloaded, in which case the session is not captured.
Most hosting providers include their own analytics platform with your hosting package. This is what I rely on as a true indicator of actual page views etc. These reports are usually generated directly from your web server logs, so they are more accurate. Sadly, I've never seen one of these packages have as many features as Google's or Facebook's.
There are tons of possible reasons. They might identify returning visitors in different ways, or users might block scripts from a specific domain (e.g. *.facebook.com but not *.google.com). In general, ignore the discrepancy. Just pick one solution and use it. You'll always have visitors blocking all such scripts, or just one or two specific trackers. The only (almost) 100% accurate way to do it would be using local scripts, but even those can be blocked. You could also look at open-source solutions such as Piwik.
Different web analytics products use different methods to track data on the site.
These differences are the reason it is hard to do a side-by-side comparison.
At the two links below you can find more info about that:
Why does Google Analytics report different values than some other web analytics solutions?
Using Google Analytics & Facebook Domain Insights to Track Social Actions on Your Website
In addition to the notes above, I also wanted to mention that Google Analytics samples data when there are large volumes of data or many dimensions. This may be a contributing factor.
Facebook reports on clicks and Analytics reports on pageviews.
The amount of pageviews might be less than the amount of clicks for a number of reasons:
There are filters on your Analytics that are blocking the pageviews from being recorded
The user left the page before the Analytics code could record the hit
The ads are being clicked by bots that Analytics isn't recording
This seems to be a big problem with Facebook ads. I run a number of campaigns with Facebook, and I only see 30-50% of the reported traffic actually make it to the site. I can't believe this is due only to the first two reasons.
I have gone into more details on my blog http://www.bradtollefsen.com/facebook-ads-adding/
I have a web app that I deployed on AppHarbor with Google Analytics. Development is still ongoing, and I test it live very often, for example to check out things I did with the CSS, etc.
Everything is working fine, but I'd like to know how many times I am accessing the website as opposed to the rest of the visitors. When checking the reports in Google Analytics, it only shows me the ISPs of the visitors. I'd need something more drilled-down, like an IP address, but this seems to go against Google Analytics' policy, and I don't know if it is even possible.
Like right now I have 72 visits, but I have been testing, so a lot of those could just be me. It would be good to know the actual visitor count.
I know this is probably a little late, but you can set a filter to ignore your own traffic in reports. Here is how you do it.
In addition to adding a deprecated variable and using filters, you can build the page so that it only prints the tracking code if, for example, an identifier cookie is not found. Another common option is a URL parameter.
You can then set this cookie in your own browser and be excluded from the traffic. A minimal sketch follows.
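A minimal client-side sketch of that idea, assuming analytics.js (the cookie name is made up; you would set it once in your own browser's console):

// Set this once in your own browser:
//   document.cookie = 'exclude_me=1; path=/; max-age=31536000';
// Then only send the pageview when the cookie is absent.
if (document.cookie.indexOf('exclude_me=1') === -1) {
  ga('send', 'pageview');
}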