How to get only CHANGED data from Google Analytics API? - google-analytics

I'm using the Google Analytics API to get the number of page views for each page of my website. To reduce the number of API calls, I run the query on an interval and cache the data on my server. On each API call, I fetch the page views of every page on my site and write them to my database.
Is there a way to get only the data that CHANGED since a specific timestamp? For example, only the page views that changed within the last 2 hours.
I think this would be some kind of filter (if one exists), but I could not find it in the documentation here https://developers.google.com/analytics/devguides/reporting/core/v3/reference#filters

You could add a filter on ga:dateHour so that only the last two hours come back. The problem is that Google takes around 4 hours to process the data, so you wouldn't get anything back for two hours ago.
If you want to see data that new, you have to use the Realtime API https://developers.google.com/analytics/devguides/reporting/realtime/v3/
What exactly is your query currently? If you request ga:date, ga:dateHour, ga:pagePath and ga:pageviews, the results will all be returned in one query (not counting next pages), and that's a long way from the 10,000 queries per day limit.
On a side note, what do you mean by changed? Nothing is going to change in data that has already been processed.
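The ga:dateHour filter mentioned above could be built along these lines. This is only a sketch: it assumes v3 dimension filters offer match operators rather than greater-than comparisons (so the recent hours are OR-ed together with commas), that ga:dateHour values look like YYYYMMDDHH in the view's time zone, and that `date_hour_filter`, `build_query` and the view ID are hypothetical names. It only assembles the URL; authentication and the request itself are left out, and because of the processing delay the newest hours may well come back empty.

```python
from datetime import datetime, timedelta
from urllib.parse import urlencode


def date_hour_filter(now, hours):
    """OR together ga:dateHour equality tests for the last `hours` clock hours."""
    stamps = [(now - timedelta(hours=h)).strftime("%Y%m%d%H") for h in range(hours)]
    # A comma between filter expressions means OR in the v3 filter syntax.
    return ",".join("ga:dateHour==" + stamp for stamp in sorted(stamps))


def build_query(view_id, now, hours=2):
    """Return a Core Reporting API v3 URL for page views per page and hour."""
    params = {
        "ids": "ga:" + view_id,  # hypothetical view (profile) ID
        "start-date": (now - timedelta(hours=hours)).strftime("%Y-%m-%d"),
        "end-date": now.strftime("%Y-%m-%d"),
        "metrics": "ga:pageviews",
        "dimensions": "ga:dateHour,ga:pagePath",
        "filters": date_hour_filter(now, hours),
    }
    return "https://www.googleapis.com/analytics/v3/data/ga?" + urlencode(params)
```

Passing `now` in explicitly keeps the function easy to test and lets the caller decide which time zone to use.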

Related

How to migrate old statistics to Google Analytics?

In our project we have stored all user event data in our database for over a year, but it's not indexed.
Now we are going to use Google Analytics to store our analytics and analyze the reports in the Google Analytics dashboard.
Before we start using Google Analytics, I would like to migrate all the old statistics (about 2 million events) to it.
For this I should use the Measurement Protocol, and its limits allow me to transfer 2 million hits with no problem.
But I haven't worked out how to set the time of the event. The Measurement Protocol has Queue Time, but Google says:
Values greater than four hours may lead to hits not being processed.
How is it possible to transfer 2 million events to Google Analytics with their event times?
Thanks
You are correct: you can use the Measurement Protocol to send event data directly to Google Analytics, and I don't see any problem in sending 2 million events. However, it's not possible to set the event time to more than four hours ago.
Queue time is used to set the time at which the event occurred. As you can see, it can't be more than four hours ago, and I have found that if you do set it to four hours ago, it's a bit fuzzy whether the data is correct or not. This feature is probably most useful on mobile devices that may go offline for a short time: you can store the data and then send it all once the device is online again.
So the dates will be the date on which you sent the event to Google Analytics; you can't backdate the data by more than four hours. I am not sure how much use the data will be to you once it is all inserted.
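To make the queue-time limit concrete, here is a minimal sketch of building a Measurement Protocol event payload with the `qt` parameter (the hit's age in milliseconds). The helper name `build_event_hit` and the sample IDs are hypothetical, and rejecting anything older than four hours is this sketch's own choice based on the guideline quoted above, not something the API is confirmed to enforce in this way. The payload would be sent as the body of a POST to https://www.google-analytics.com/collect.

```python
from urllib.parse import urlencode

# Four-hour guideline quoted from the Queue Time documentation, in milliseconds.
MAX_QUEUE_TIME_MS = 4 * 60 * 60 * 1000


def build_event_hit(tracking_id, client_id, category, action, age_ms):
    """Return the form-encoded body for one event hit that is `age_ms` old."""
    if age_ms < 0:
        raise ValueError("queue time cannot be negative")
    if age_ms > MAX_QUEUE_TIME_MS:
        # Older hits may not be processed, so refuse instead of sending them.
        raise ValueError("event is more than four hours old")
    params = {
        "v": "1",            # protocol version
        "tid": tracking_id,  # property ID, e.g. UA-XXXXX-Y
        "cid": client_id,    # anonymous client ID
        "t": "event",        # hit type
        "ec": category,      # event category
        "ea": action,        # event action
        "qt": str(age_ms),   # queue time: how long ago the event happened
    }
    return urlencode(params)
```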
There is no way to do this, but you can make it easier on yourself.
Unfortunately, there is no way to add, remove, or otherwise edit Google Analytics hit data retrospectively, except to delete all of it. You also cannot copy it, move it between accounts, or download it all.
You are not the first to have to come to terms with this.
In this situation, we recommend to our clients that they run their new and old systems in parallel for a testing period (usually 6 months or a year), before switching off one of them.
Yes, it's difficult to let go of old data, but sometimes it has to be done.

Google Calendar synchronization spec

I have a question regarding "Add Calendar By URL" function in Google Calendar:
How often is it updated (most sources I've found say every 24 hours)? Does the caladress.ics?noCache workaround still work?
How is it updated? If I have a large calendar (e.g. 2008 - 2016) and add a single event, does Calendar re-upload the whole calendar or check for a diff? If it checks for a diff, are there any limitations?
Is there any limit on how long an event can be? E.g. is it possible to set a 5-year event?
1. How often is it updated (most sources I've found say every 24 hours)? Does the caladress.ics?noCache workaround still work?
According to the Google thread, updates may take a few hours before the new information is parsed and viewable by your users.
Note: It might take up to 12 hours for changes to show in your Google Calendar.
You can use no-cache to indicate that the returned response cannot be used to satisfy a subsequent request to the same URL without first checking with the server if the response has changed. Here is the documentation and example.
2. How is it updated? If I have a large calendar (e.g. 2008 - 2016) and add a single event, does Calendar re-upload the whole calendar or check for a diff? If it checks for a diff, are there any limitations?
How Calendar is updated depends on how you implement "incremental synchronization" of calendar data. A sync can be an initial full sync or an incremental sync.
Initial full sync is performed once at the very beginning in order to fully synchronize the client’s state with the server’s state. You can optionally restrict the list request using request parameters if you only want to synchronize a specific subset of resources.
Incremental sync, by contrast, allows you to retrieve all the resources that have been modified since the last sync request. You need to perform a list request with your most recent sync token specified in the syncToken field. Keep in mind that the result will always contain deleted entries, so that clients get the chance to remove them from storage.
3. Is there any limit on how long an event can be? E.g. is it possible to set a 5-year event?
As for limits, the Google Calendar API has a courtesy limit of 1,000,000 queries per day. You can see the calendar usage limits here. You can create the event as long as you haven't reached the limit on the number of events you can create.
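The incremental sync described under point 2 can be sketched as a small loop. This is an illustration under assumptions, not the official client: `list_page` is a hypothetical callable standing in for an events list request that takes the request parameters and returns one decoded response page, and the field names (`items`, `status`, `nextPageToken`, `nextSyncToken`) are assumed to follow the Calendar API's list responses. A real client would also need to handle the server rejecting an expired sync token by starting over with a full sync.

```python
def sync_events(list_page, store, sync_token=None):
    """Run one sync pass against `store` (a dict of id -> event).

    With no `sync_token` this behaves as an initial full sync; with one it
    only receives entries changed since that token was issued. Returns the
    sync token to save for the next pass.
    """
    params = {"syncToken": sync_token} if sync_token else {}
    while True:
        page = list_page(params)
        for item in page.get("items", []):
            if item.get("status") == "cancelled":
                # Deleted entries are reported so the client can drop them.
                store.pop(item["id"], None)
            else:
                store[item["id"]] = item
        next_page = page.get("nextPageToken")
        if not next_page:
            # The last page carries the token for the next incremental sync.
            return page.get("nextSyncToken")
        params = dict(params, pageToken=next_page)
```

Injecting `list_page` keeps the loop independent of any particular HTTP or client library.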

Google Analytics list of page views with timestamps?

Is it possible to export a list of all page views, with timestamps, out of google analytics?
Currently I can see page views per day, but not specific times for each view. I do see real-time data as it comes in, so I assume the time for each view is being stored.
If not, how might I go about sending that data when I report the page view in a way that will make it accessible in the dashboard?
[[UPDATE]] The Google Analytics v4 API now includes a "dateHourMinute" dimension.
Inside Google Analytics, I think the problem you will run into is that a "timestamp" includes date, hour, minute and second (and maybe milliseconds), and you can only drill into one secondary dimension at a time (i.e. just date). An alternative is to use the Query Explorer or to call the Google Analytics API directly.
Ad Hoc Report with query explorer
You can use the Google Analytics API Explorer: https://ga-dev-tools.appspot.com/query-explorer/
It will allow you to select:
A date range
Metrics: (like Users, pageviews, etc)
Dimensions: pagePath, Date, Hour, Minute (can't go to second or further, as in a timestamp, though)
Use filters to get rid of some data.
If your website gets a decent amount of traffic (> 50,000 pageviews per month), you might want to grab it 1 month (or less) at a time to avoid hitting limits/sampling effects.
@doctaj's solution works. Just note that if you are using the Query Explorer, for example, the "dateHourMinute" dimension counts as 3 dimensions and uses up 3 of the 7 or 10 dimension slots available to a query. So if you need more dimensions, you have to decide which matter most to you. For example, you may want to query the date and hour dimensions separately and drop the minutes in favour of something else.
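Once such a report is exported, the minute-level values can be turned back into real timestamps. The sketch below assumes that dateHourMinute values are 12-digit strings of the form YYYYMMDDHHMM and that each exported row is a (dateHourMinute, pagePath, pageviews) triple; the function names and the row shape are hypothetical.

```python
from datetime import datetime


def parse_date_hour_minute(value):
    """Convert a 'YYYYMMDDHHMM' string into a naive datetime."""
    return datetime.strptime(value, "%Y%m%d%H%M")


def rows_to_timestamps(rows):
    """Turn (dateHourMinute, pagePath, pageviews) rows into typed tuples."""
    return [
        (parse_date_hour_minute(stamp), path, int(views))
        for stamp, path, views in rows
    ]
```

The resulting datetimes carry no time zone; they are presumably in whatever time zone the view reports in.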

How can I pull data from Google Analytics to see the top pages visited from the current page?

I would like to create a small sidebar on each page of my website that contains related/popular pages with perhaps the top five pages users visit after reading the current page.
I could track and record user movements across the site myself and build the list that way, but as my site already uses Google Analytics and I know the data is there I'd rather access that if all possible.
The trouble is that I don't have the faintest idea whether it is possible or not.
Remember that the Google Analytics Reporting API is not real-time: it can take 24 to 48 hours for the data to finish processing and become available in the API.
The Realtime Google Analytics API is real-time, but the data is only about 5 minutes old and it is very limited in the dimensions and metrics you can request.
Quota: with either of those APIs you are limited to 10,000 requests per day per profile / view. I have no idea how many pages there are on your site or how many users are on it, but this could quickly blow through this NON-extendable quota.
Options: accept that it's not real-time data and use the Reporting API. Every night, run a request against the API to get everything for two days ago, then show the users of your site data that's two days old. Store the data in a database; then you are showing them data from your DB and won't have an issue with the quota, as you only requested it once.
But this isn't exactly what you want, as it's not showing a user's activity across the site. TBH I am not sure you can use Google Analytics to track a user at all, as the data is not user-specific.
If you don't want to get involved with learning the API and developing this from the ground up, check out EmbeddedAnalytics (disclaimer: I created the service). We could provide such a widget.
You may find This Article useful. It provides the necessary query to find the "next page visited" using the page of interest as a filter. Ultimately your query would look like this:
https://www.googleapis.com/analytics/v3/data/ga?ids=ga%3Aabc&start-date=30daysAgo&end-date=yesterday&metrics=ga%3Apageviews&dimensions=ga%3ApreviousPagePath%2Cga%3AnextPagePath&sort=-ga%3Apageviews&filters=ga%3ApreviousPagePath%3D%40pricing
The query above will give you the "Next Page" along with pageviews assuming the "previous" page contains the word "pricing".
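The same URL can be assembled programmatically instead of being encoded by hand. In this sketch `next_page_query_url` is a hypothetical helper, `abc` is the placeholder view ID from the URL above, and the page fragment is assumed to contain no characters that have special meaning in the filter syntax.

```python
from urllib.parse import urlencode


def next_page_query_url(view_id, page_fragment):
    """Build the 'next page visited' query for pages containing a fragment."""
    params = {
        "ids": "ga:" + view_id,
        "start-date": "30daysAgo",
        "end-date": "yesterday",
        "metrics": "ga:pageviews",
        "dimensions": "ga:previousPagePath,ga:nextPagePath",
        "sort": "-ga:pageviews",  # most viewed next pages first
        # =@ is the "contains substring" operator.
        "filters": "ga:previousPagePath=@" + page_fragment,
    }
    return "https://www.googleapis.com/analytics/v3/data/ga?" + urlencode(params)
```

Calling `next_page_query_url("abc", "pricing")` reproduces the URL shown above.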
We could easily build such a report widget for you:
You would insert a javascript source code snippet into your page. The javascript would pass the page url to our server and we would return the next "most popular pages visited".
The pages could be "linkified" so that someone could click the link to go to that page.
We already have caching mechanism in place. So each pageview would not require a new query to google (making it quicker and also staying away from the API quota that was mentioned above). For pages that are hardly ever looked at (e.g. less than once a week), we could make "on-demand" calls to get the statistics.
In my experience with the API, the lag in the API is only a couple hours. It may be longer for larger sites.
Please let me know if you are interested in such a widget and I can work with you.

How to include custom segments in the list of segments when querying the Google Analytics API?

This may be a possible duplicate of this question, but according to all the Google Analytics documentation I really should be able to pull my list of custom segments.
Since I have a very large list of them, it would be suboptimal for me to manually copy the segment ids over one at a time.
I'm following this walk through. Steps to reproduce:
Create a custom segment using date of first session in your Google Analytics account.
Authorize the Google Analytics guide to access your Google Analytics account.
Try their on-page query tester, and inspect whether your custom segment is there.
One thing I've already ruled out was the user that created the segment. I've manually created a segment with the same user that I'm querying the API with and it still does not show. Is there a flag I need to set somewhere to include custom segments?
Edit:
It turns out that it will list some custom segments, but not ones created with date of first session, so this is a duplicate of this question, which means that there is a bug in the Google Analytics API.
There was a bug which is now fixed. So it is now possible to list the Date of Session Segments in the Google Analytics Management API by calling the segments.list() method.
So after days of trying to solve this one I've come to the conclusion that it cannot be done as asked.
There is, however, another way to do it. For every segment, set up a daily (or weekly, etc.) email report sent to an email address as a TSV. In each email body, specify the name of the segment, so that when you're consuming the emails you know which segment the attached TSV is for. It doesn't look like the daily reports were designed with segments in mind, since none of the metadata included in the TSV mentions which segment it is for.
From there it's trivial. Connect to the email address using an IMAP client once a day and update the numbers.
Note that the daily email only contains the numbers for that day (not a specified range), so you'll need to first generate the report one time with the historical data to load in.
While hacky, one nice thing about this approach is that it keeps your reports in sync with your (faked through email) api code (provided you match the column headings in the TSV). So, if for example, a new filter is included into a report, the new daily fields will continue to update.
Unfortunately though, the past data won't be reflected in the change.
Obviously this isn't great, but if you are monitoring daily cohorts it's the best you've got if you need to stay with Google Analytics. I have raised this as a bug to the Google Analytics developers, but I haven't heard back as to whether or not they plan to fix it.
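The email-consumption step of this workaround might look roughly like the following. It is a sketch under assumptions: the raw message bytes have already been fetched (for example with an IMAP client), the plain-text body contains nothing but the segment name, and the report arrives as a UTF-8 tab-separated attachment with a header row. `read_segment_report` is a hypothetical name.

```python
import csv
import io
from email import message_from_bytes, policy


def read_segment_report(raw_email):
    """Return (segment_name, rows) from one report email given as bytes."""
    msg = message_from_bytes(raw_email, policy=policy.default)

    # The segment name is assumed to be the whole plain-text body.
    body = msg.get_body(preferencelist=("plain",))
    segment = body.get_content().strip() if body is not None else ""

    rows = []
    for part in msg.iter_attachments():
        content = part.get_content()
        if isinstance(content, bytes):
            # Non-text attachment types come back as bytes.
            content = content.decode("utf-8")
        # Each TSV row becomes a dict keyed by the column headings.
        rows.extend(csv.DictReader(io.StringIO(content), delimiter="\t"))
    return segment, rows
```

Keying the rows by column heading matches the idea above of keeping the code in sync with the report as long as the headings stay the same.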

Resources