I am fetching my data from the Google Analytics Core Reporting API. I have learned that the API returns at most 7 dimensions per query, but I need more than 7 dimensions with correct metrics. Is there any way (other than paying for Google Analytics) to fetch more than 7 dimensions with correct metrics?
If not, is there a mathematical formula for finding the intersection of results fetched by 2 different queries that have one dimension in common?
Thanks
The only way I have found around the 7 dimension limit is to be creative with filters. It means sending more queries to the server, but if you aren't worried about your quota limit then it's doable.
Example: ga:visitorType has only two values: New Visitor and Returning Visitor.
So you could run a query with your 7 dimensions and add a filter ga:visitorType==New Visitor, then run the same 7 dimensions with a filter ga:visitorType==Returning Visitor.
Bingo, you just got an 8th dimension.
Again, you need to be a little careful, because the number of queries you run is multiplied by the number of values of each dimension you apply creative filtering to. I wouldn't recommend doing it with, say, ga:landingPagePath, because the number of landing pages you have is too big; you will end up running too many queries. But there are a few you can do this with: ga:isMobile, ga:deviceCategory, ga:date. Just remember that the more creative filters you add, the more queries you end up running. I had one where I ended up expanding it to 80 requests just to get the data back I needed. Scary, but it worked.
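The filter trick above can be sketched roughly as follows. This is only an illustration: `run_query` is a hypothetical callable standing in for whatever client code actually sends a Core Reporting API request, and the parameter names in `base_params` are assumptions. The sketch runs one filtered query per value and appends that value to every row as the extra column.

```python
from typing import Callable, Dict, List

Row = List[str]


def query_with_extra_dimension(
    run_query: Callable[[Dict[str, str]], List[Row]],
    base_params: Dict[str, str],
    extra_dimension: str,
    values: List[str],
) -> List[Row]:
    """Run one filtered query per value and append the value as an extra column.

    run_query is a hypothetical stand-in for your own API client call.
    """
    combined: List[Row] = []
    for value in values:
        params = dict(base_params)
        # '==' is an exact match; values containing ',' or ';' would need escaping.
        extra_filter = f"{extra_dimension}=={value}"
        if params.get("filters"):
            # ';' combines filter expressions with AND.
            params["filters"] = params["filters"] + ";" + extra_filter
        else:
            params["filters"] = extra_filter
        for row in run_query(params):
            combined.append(row + [value])
    return combined
```

The number of requests equals the number of values passed in, which is why this only suits dimensions with a handful of values.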
If you have specific dimensions that can be used to identify a session at a specific time (e.g. a session ID and browser timestamp), you can execute multiple queries then patch them together.
I built a python program that will do exactly this: https://github.com/aiqui/ga-download
This program can bring together multiple groups of dimensions, so that any number of dimensions can be downloaded and combined into a single CSV file.
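As a rough idea of what patching queries together can look like (this is a minimal sketch, not the code of the linked program), the rows of several result sets can be merged on the dimensions they share. The key fields `ga:dimension1` and `ga:dimension2` in the usage example are hypothetical custom dimensions assumed to hold a session ID and a browser timestamp.

```python
from typing import Dict, List, Sequence, Tuple


def merge_on_key(
    result_sets: Sequence[List[Dict[str, str]]],
    key_fields: Sequence[str],
) -> List[Dict[str, str]]:
    """Merge rows from several queries that share the same key dimensions."""
    merged: Dict[Tuple[str, ...], Dict[str, str]] = {}
    for rows in result_sets:
        for row in rows:
            key = tuple(row[field] for field in key_fields)
            # Rows with the same key are combined into one wider row.
            merged.setdefault(key, {}).update(row)
    return list(merged.values())


# Usage with two made-up result sets sharing a session ID and timestamp:
first = [{"ga:dimension1": "s1", "ga:dimension2": "1700000000", "ga:country": "France"}]
second = [{"ga:dimension1": "s1", "ga:dimension2": "1700000000", "ga:browser": "Firefox"}]
combined = merge_on_key([first, second], ["ga:dimension1", "ga:dimension2"])
```

This only gives correct results if the key really identifies a single session at a single time; otherwise unrelated rows would be merged.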
I am using the Google Analytics reporting tool in PHP.
I am having trouble querying the data when the dimension data is not available.
For example:
These are the dimensions I have:
ga:date,ga:countryIsoCode,ga:eventCategory,ga:eventAction,ga:eventLabel
This is the metric I have:
ga:totalEvents
The problem is that ga:eventLabel is present for some events and missing for others.
When I query using these dimensions and metrics, the result set does not include the events that have no ga:eventLabel.
How can I get the data even when ga:eventLabel is missing?
This is how GA reporting works.
A row with an empty dimension is almost always dropped from the result, along with its other dimensions (with a few exceptions where GA uses (not set)).
You either need to always add a value to ga:eventLabel upstream while tracking, or do the hard query work and run one query per level:
1. ga:eventCategory
2. ga:eventCategory,ga:eventAction
3. ga:eventCategory,ga:eventAction,ga:eventLabel
Then you, intelligently, stitch the datasets together.
You need to make sure you don't count a result from query 1 again when it is already broken down in query 2, and so on.
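One way to read the stitching step is sketched below, assuming the metric is additive (as ga:totalEvents is) and the data is not sampled: the events without a label are the total for a category/action pair minus the sum of its labeled rows. The function name and the "(no label)" placeholder are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Tuple

ActionKey = Tuple[str, str]
LabelKey = Tuple[str, str, str]


def add_unlabeled_events(
    by_action: Dict[ActionKey, int],
    by_label: Dict[LabelKey, int],
    placeholder: str = "(no label)",
) -> Dict[LabelKey, int]:
    """Combine query 2 (category, action) with query 3 (category, action, label)."""
    labeled_totals: Dict[ActionKey, int] = defaultdict(int)
    for (category, action, _label), count in by_label.items():
        labeled_totals[(category, action)] += count

    stitched = dict(by_label)
    for (category, action), total in by_action.items():
        # Whatever query 3 did not account for had no label.
        missing = total - labeled_totals[(category, action)]
        if missing > 0:
            stitched[(category, action, placeholder)] = missing
    return stitched


by_action = {("video", "play"): 10, ("video", "pause"): 4}
by_label = {("video", "play", "intro"): 7}
stitched = add_unlabeled_events(by_action, by_label)
```

The same subtraction can be applied one level up, between the category-only query and the category/action query, to recover events that have no action.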
Is it possible to export a list of all page views, with timestamps, out of google analytics?
Currently I can see page views per day, but not specific times for each view. I do see real-time data as it comes in, so I assume the time for each view is being stored.
If not, how might I go about sending that data when I report the page view in a way that will make it accessible in the dashboard?
[[UPDATE]] The Google Analytics v4 API now includes a "dateHourMinute" dimension.
Inside Google Analytics, I think the problem you will run into is that a "timestamp" includes date, hour, minute, second (and maybe milliseconds), and you can only drill into one secondary dimension at a time (i.e. just date). An alternative is to use the Query Explorer or the Google Analytics API directly.
Ad Hoc Report with query explorer
You can use the Google Analytics API Explorer: https://ga-dev-tools.appspot.com/query-explorer/
It will allow you to select:
A date range
Metrics: (like Users, pageviews, etc)
Dimensions: pagePath, Date, Hour, Minute (can't go to second or further, as in a timestamp, though)
Use filters to get rid of some data.
If your website gets a decent amount of traffic (> 50,000 pageviews per month), you might want to grab it 1 month (or less) at a time to avoid hitting limits/sampling effects.
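Splitting a long date range into month-sized requests, as suggested above, could look something like this. It is a minimal sketch using only the standard library; each yielded pair would be used as the start and end date of one query.

```python
from datetime import date, timedelta
from typing import Iterator, Tuple


def month_chunks(start: date, end: date) -> Iterator[Tuple[date, date]]:
    """Yield (chunk_start, chunk_end) pairs that never cross a month boundary."""
    current = start
    while current <= end:
        # First day of the month after `current`.
        if current.month == 12:
            next_month = date(current.year + 1, 1, 1)
        else:
            next_month = date(current.year, current.month + 1, 1)
        chunk_end = min(end, next_month - timedelta(days=1))
        yield current, chunk_end
        current = next_month


chunks = list(month_chunks(date(2023, 11, 15), date(2024, 1, 10)))
```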
doctaj's solution works. Just note that if you are using the Query Explorer, for example, the "dateHourMinute" dimension counts as 3 dimensions and will use up 3 of the 7 or 10 dimension slots available to your query. So if you need more dimensions, you have to decide which matters more to you. For example, you may want to query the date and hour dimensions separately and drop the minutes in favour of something else.
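If the date, hour and minute come back as separate columns, they can be recombined into a timestamp afterwards. The sketch below assumes the values arrive as zero-padded strings (date as YYYYMMDD, hour as HH, minute as MM); the minute defaults to "00" for the case where it was dropped to save a dimension slot.

```python
from datetime import datetime


def to_timestamp(ga_date: str, ga_hour: str, ga_minute: str = "00") -> datetime:
    """Combine separate date/hour/minute strings into one datetime."""
    return datetime.strptime(ga_date + ga_hour + ga_minute, "%Y%m%d%H%M")


with_minute = to_timestamp("20240131", "09", "05")
hour_only = to_timestamp("20240131", "23")
```

The result has no time zone attached; it is in whatever time zone the reporting view uses.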
I am trying to export more than 5000 rows from Google Analytics to an Excel sheet, but I am not able to do so. I have searched Google extensively and tried the URL modification that replaces explorer-table.rowCount%3D5000 with explorer-table.rowCount%3D50000, but it's not working for me. It still returns only 5000 rows. Please help. Thanks in advance
Here are some options to pull more than 5000 rows from Google Analytics.
1) Use the Google Analytics Query Explorer to pull 10,000 rows (API query max). Set max-results to 10,000. Use start-index to pull additional 10k row chunks (set start-index to 10,001, then 20,001, then 30,001, etc.).
2) Use the Google Analytics Sheets Add-On to pull 10,000 rows.
3) Use Python to pull data 10,000 rows at a time via the Google Analytics API. Here is a solution that will let you pull over 1 million rows unsampled from Google Analytics using Python.
4) Use an Excel Connector tool like Analytics Edge to pull up to 1 million rows of data from Google Analytics.
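The start-index paging described in options 1 and 3 can be sketched as a simple loop. `fetch_page` is a hypothetical callable standing in for your own API request; it is assumed to take a 1-based start index and a maximum row count and return the rows for that page.

```python
from typing import Callable, List


def fetch_all_rows(
    fetch_page: Callable[[int, int], List[list]],
    page_size: int = 10000,
) -> List[list]:
    """Keep requesting pages until one comes back shorter than page_size."""
    rows: List[list] = []
    start_index = 1  # start-index is 1-based: 1, 10001, 20001, ...
    while True:
        page = fetch_page(start_index, page_size)
        rows.extend(page)
        if len(page) < page_size:
            break
        start_index += page_size
    return rows
```

If the total happens to be an exact multiple of the page size, the loop makes one extra request that returns no rows before stopping.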
As a fairly frequent GA user, I can confirm that the replacement of "D500" to "D100000" in the URL used to work fine for me, minus the occasional browser crash, but now, no matter what I change that value to, I can still only get a max of 5000 rows. I'm not sure when specifically it stopped, but I'm pretty sure I was able to use the workaround ~6 months ago.
A very handy way to do this (from Reddit):
Just recreate the report in Google Data Studio, hover over the table in View mode (not edit), click the 3 button menu and 'export as CSV'. It automatically fetched all the rows.
Google were recently pushing Premium Analytics at me, but at $150,000 per annum it's a non-starter. But I suspect, like the curse of data sampling (there's no sampling with Premium), they are endeavouring to make the free version slightly less usable by quietly putting a block on that old export modifier trick. It used to work fine for me, but no more.
I'm using the Google Analytics API to get the number of page views for each page of my website. To reduce the number of API calls, I run this at a fixed interval and cache the data on my server. On each API call, I try to get the page views of every page on my site and update them in my database.
Is there a way to get only CHANGED DATA from a specific time stamp? For example, only page views that changed within last 2 hours.
I think it would be some kind of filter (if one exists), but I could not find it in the documentation here: https://developers.google.com/analytics/devguides/reporting/core/v3/reference#filters
You could add a filter for ga:dateHour so that it comes back in the last two hours. But the problem is that it takes Google around 4 hours to process the data. So you wouldn't get anything back for two hours ago.
If you want to see data that is that new, you have to use the Realtime API: https://developers.google.com/analytics/devguides/reporting/realtime/v3/
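For reference, the ga:dateHour filter mentioned above could be built along these lines. This is a sketch under a few assumptions: ga:dateHour values are formatted as YYYYMMDDHH, dimension filters are matched with `==`, and a comma combines filter expressions with OR. The `now` argument should be in the reporting view's time zone, and the processing delay described above still applies.

```python
from datetime import datetime, timedelta


def last_hours_filter(now: datetime, hours: int) -> str:
    """Build a filter matching the current hour and the hours before it."""
    stamps = [
        (now - timedelta(hours=offset)).strftime("%Y%m%d%H")
        for offset in range(hours - 1, -1, -1)
    ]
    # ',' means OR in a filter expression.
    return ",".join(f"ga:dateHour=={stamp}" for stamp in stamps)


filter_value = last_hours_filter(datetime(2024, 1, 1, 0, 30), 2)
```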
What exactly is your query currently? If you request ga:date, ga:dateHour, ga:pagePath and ga:pageviews, the results will all be returned in one query (not counting next pages); that's a long way from the 10,000 queries per day limit.
On a side note. What do you mean by changed? Nothing is going to change in data previously processed.
I am trying to pull out the (unique) visitor count for a certain directory using three different methods:
* with a profile
* using a dynamic advanced segment
* using a custom report filter
On a smaller site the three methods give the same result. But on the large site (> 5M visits/month) I get a big discrepancy between the profile on one hand and the advanced segment and filter on the other. This might be because of sampling - but the difference is smaller when it comes to pageviews. Is the estimation of visitors worse and the discrepancy bigger when using sampled data? Also when extracting data from the API (using filters or profiles) I still get DIFFERENT data even if GA doesn't indicate that the data is sampled - ie I'm looking at unsampled data.
Another strange thing is that the pageviews are higher in the profile than the filter, while the visitor count is higher for the filter vs the profile. I also applied a filter at the profile to force it to use sample data - and I again get quite similar results to the filter and segment-data.
            profile   filter   segment   filter#profile
unique        25550    37778     36433            37971
pageviews    202761   184130       n/a           202761
What I am trying to achieve is to find a way to get somewhat accurate data on unique visitors when I've run out of profiles to use.
More data with discrepancies can be found in this google docs: https://docs.google.com/spreadsheet/ccc?key=0Aqzq0UJQNY0XdG1DRFpaeWJveWhhdXZRemRlZ3pFb0E
Google Analytics (free version) tracks only 10 million page interactions [0] (pageviews and events; any tracker method that starts with "track" is an interaction) per month [1], so presumably the data for your larger site is already heavily sampled (I guess each of your 5 million visitors has more than two interactions) [2]. Ad hoc reports use at most 1 million data points, so you have a sample of a sample. Naturally, aggregated values suffer more from smaller sample sizes.
And I'm pretty sure the data limits apply to API access too (Google says that there is "no assurance that the excess hits will be processed"), so for the large site the API returns sampled (or incomplete) data, too, and you cannot really be looking at unsampled data.
As for the differences, I'd say that different ad hoc reports use different samples, so you end up with different results. With GA you shouldn't rely too much on absolute numbers anyway; look more for general trends.
[1] Analytics Premium tracks 50 million interactions per month (and has support from Google) but costs 150,000 USD per year
[2] Google suggests using "_setSampleRate()" on large sites to make sure you actually have sampled data for each day of the month, instead of random hit or miss after you exceed the data limits.
Data limits:
http://support.google.com/analytics/bin/answer.py?hl=en&answer=1070983
setSampleRate:
https://developers.google.com/analytics/devguides/collection/gajs/methods/gaJSApiBasicConfiguration#_gat.GA_Tracker_._setSampleRate
Yes, the sampled data is less accurate, especially with visitor counts.
I've also seen them miss 500k pageviews over two days, only to see them appear in their reporting a few days later. It also doesn't surprise me to see different results from different interfaces. The quality of Google Analytics has diminished, even as they have tried to become more real-time. It appears that their codebase is inconsistent across APIs, and their algorithms are all over the map.
I usually stick with the same metrics and reporting methods, so that my results remain comparable to one another. I also run GA in tandem with Gaug.es, as a validation and sanity check. With that extra data, I choose the reporting method in GA that I am most confident with and I rely on that exclusively.