Core Reporting API returns different values for the same metrics depending on the combination of specified metrics - google-analytics

I want to fetch some metrics values like goalXXCompletions and goalXXConversionRate via Core Reporting API, along with region and regionId dimensions.
However, when I also specify either the users or the uniquePageviews metric, the returned values change unexpectedly, even though the dimensions are the same.
After some investigation, I found that, compared with the results of a query that specifies only goalXXCompletions and goalXXConversionRate:
Returned values change when users and uniquePageviews are also specified
Returned values do NOT change when sessions and pageviews are also specified
Can anyone explain why the returned values vary with the combination of requested metrics?
I tried both the Query Explorer and raw HTTP API requests, and the results were the same.
UPDATE
Added an example of a raw response. The values for each region/regionId differ depending on whether users/uniquePageviews are specified.
{
"totalResults": 28,
"containsSampledData": true,
"columnHeaders": [
{
"name": "ga:regionId"
},
{
"name": "ga:region"
},
{
"name": "ga:users"
},
{
"name": "ga:goal1ConversionRate"
}
],
"totalsForAllResults": {
"ga:users": "1125",
"ga:goal1ConversionRate": "7.155222158107328"
},
"rows": [
[
"(not set)",
"(not set)",
"2",
"0.0"
],
[
"21152",
"Massachusetts",
"358",
"20.930232558139537"
],
[
"21176",
"Texas",
"463",
"2.9797377830750893"
]
]
}
The above is not the exact response, but it is nearly identical apart from the number of rows.

As Matt mentioned in the comments, I found that this is caused by sampled data (note "containsSampledData": true in the response above).
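One way to catch this early is to check the `containsSampledData` flag that appears in the response above before comparing values across queries. The following is only a minimal sketch: the `is_sampled` helper is a hypothetical name, and it assumes the response body has already been parsed into a dict.

```python
import json


def is_sampled(response: dict) -> bool:
    """Return True when a Core Reporting response is flagged as sampled."""
    # Assumption of this sketch: a missing flag is treated as "not sampled".
    return bool(response.get("containsSampledData", False))


# Trimmed copy of the example response shown above.
raw = '{"totalResults": 28, "containsSampledData": true}'
response = json.loads(raw)

if is_sampled(response):
    print("Sampled data: values may shift when the metric combination changes.")
```

If the flag is true, differences between two otherwise identical queries are more likely to come from sampling than from the metrics themselves.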

Related

Why Google Analytics "minute" dimension doesn't work?

I am trying to execute the following request using the Google Analytics Reporting API:
"reportRequests": [
{
"metrics": [
{
"expression": "ga:sessions"
}
],
"dimensions": [
{
"name": "ga:date"
},
{
"name": "ga:minute"
}
],
"dateRanges": [
{
"startDate": "2018-01-02",
"endDate": "2018-04-17"
}
],
"viewId": "validViewId",
"hideValueRanges": true,
"hideTotals": true
}
]
Unfortunately, I get no data. But when I use only the "date" dimension, without "minute", I do get data in the response.
Could someone please explain this weird behavior?
Because you are querying historical data (here > 2 years old), the granularity available to you depends on the data retention period defined at the property level. Beyond the retention period, you can only query combinations of dimensions that are covered by Google Analytics' standard aggregated tables (GA Help Center).
In your case the combination "Session | Date | Minute" is not part of the Google Analytics aggregated tables, whereas "Session | Date" is.
You can reproduce this behavior by running your query over different time ranges: within the retention period vs. outside the retention period.
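The comparison described above can be set up by building the same request twice with different date ranges. This is a rough sketch only: `build_body` is a hypothetical helper, the field values are copied from the question, and the assumption that "30daysAgo" falls inside your property's retention period must be checked against your own settings.

```python
import copy

# Report request from the question, without the date range.
BASE_REQUEST = {
    "metrics": [{"expression": "ga:sessions"}],
    "dimensions": [{"name": "ga:date"}, {"name": "ga:minute"}],
    "viewId": "validViewId",
    "hideValueRanges": True,
    "hideTotals": True,
}


def build_body(start_date: str, end_date: str) -> dict:
    """Return a batchGet body for the base request over the given date range."""
    request = copy.deepcopy(BASE_REQUEST)
    request["dateRanges"] = [{"startDate": start_date, "endDate": end_date}]
    return {"reportRequests": [request]}


# Range from the question (assumed to be outside the retention period).
old_body = build_body("2018-01-02", "2018-04-17")
# Recent range (assumed to be inside the retention period).
recent_body = build_body("30daysAgo", "yesterday")
```

Sending both bodies to the same view and comparing the row counts should show whether the "minute" dimension only fails for the older range.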

Google Analytics API Revenue Discrepancies

I have discrepancies in the revenue metric, between the data I collect from the Google Analytics API and the custom reports in the user interface.
The discrepancy has the same ratio for every value: the data collected through the API is always greater than the data in the custom reports.
This is the body of the request I'm using:
{
"reportRequests":[
{
"viewId":"xxxxxxxxxx",
"dateRanges": [{"startDate":"2017-07-01","endDate":"2018-12-31"}],
"metrics": [
{"expression": "ga:transactionRevenue","alias": "transactionRevenue","formattingType": "CURRENCY"},
{"expression": "ga:itemRevenue","alias": "itemRevenue","formattingType": "CURRENCY"},
{"expression": "ga:productRevenuePerPurchase","alias": "productRevenuePerPurchase","formattingType": "CURRENCY"}
],
"dimensions": [
{"name": "ga:channelGrouping"},
{"name": "ga:sourceMedium"},
{"name": "ga:dateHour"},
{"name": "ga:transactionId"},
{"name": "ga:keyWord"}
],
"pageSize": "10000"
}]}
This is an extract of the response:
{
"reports": [
{
"columnHeader": {
"dimensions": [
"ga:channelGrouping",
"ga:sourceMedium",
"ga:dateHour",
"ga:transactionId",
"ga:keyWord"
],
"metricHeader": {
"metricHeaderEntries": [
{
"name": "transactionRevenue",
"type": "CURRENCY"
},
{
"name": "itemRevenue",
"type": "CURRENCY"
},
{
"name": "productRevenuePerPurchase",
"type": "CURRENCY"
}
]
}
},
"data": {
"rows": [
{
"dimensions": [
"(Other)",
"bing / (not set)",
"2018052216",
"834042319461-01",
"(not set)"
],
"metrics": [
{
"values": [
"367.675436",
"316.55053699999996",
"316.55053699999996"
]
}
]
},
...
So, if I create a custom report in the Google Analytics user interface and look for the transaction ID 834042319461-01, I get the following result:
[Image: Google Analytics custom report filtered by transaction ID 834042319461-01]
In the end I have a revenue value of 367.675436 in the API response but 333.12 in the custom report, so the API value is 10.37% higher. I get this same 10.37% increase for all values.
Why am I getting this discrepancy?
What would you recommend to solve this problem?
Thanks.
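For reference, the 10.37% figure follows directly from the two values quoted above. A short sketch of the arithmetic, using only the numbers from the question:

```python
api_revenue = 367.675436  # ga:transactionRevenue returned by the API
ui_revenue = 333.12       # value shown in the custom report

ratio = api_revenue / ui_revenue
percent_higher = (ratio - 1) * 100

print(f"API value is {percent_higher:.2f}% higher than the UI value")
```

A ratio that stays constant across all rows points to a systematic factor rather than random sampling noise.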
My bet is that you're experiencing sampling (is your time range in the UI shorter than in the API?): https://support.google.com/analytics/answer/2637192?hl=en
Sampling applies when:
you customize the reports
the number of sessions for the overall time range of the report (whether or not your query returns fewer sessions) exceeds 500K (GA) or 100M (GA 360)
The consequence is that:
the report will be based on a subset of the data (the % depends on the total number of sessions)
therefore your report data won't be as accurate as usual
What you can do to reduce sampling:
increase the sample size (this only reduces sampling to a certain extent, and in most cases won't remove it completely). In the UI it's done via the option at the top of the report; in the API it's done using the samplingLevel option
reduce the time range
create filtered views so your reports contain the data you need without needing to customize reports
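The samplingLevel suggestion can be sketched as follows, assuming the v4 `reports:batchGet` body format used in the question. The helper names are hypothetical. The detection part relies on the `samplesReadCounts` and `samplingSpaceSizes` fields that v4 includes in a report's `data` section when it is sampled; treat that check as a sketch and verify it against a real response.

```python
import copy


def with_large_sampling(body: dict) -> dict:
    """Return a copy of a batchGet body that requests the largest sample size."""
    updated = copy.deepcopy(body)
    for request in updated.get("reportRequests", []):
        request["samplingLevel"] = "LARGE"
    return updated


def sampled_report_indexes(response: dict) -> list:
    """List the indexes of reports whose data section carries sampling counters."""
    sampled = []
    for index, report in enumerate(response.get("reports", [])):
        data = report.get("data", {})
        if data.get("samplesReadCounts") or data.get("samplingSpaceSizes"):
            sampled.append(index)
    return sampled


# Placeholder view ID, as in the question.
body = {"reportRequests": [{"viewId": "xxxxxxxxxx"}]}
large_body = with_large_sampling(body)
```

Even with the largest sample size a report can still be sampled, so checking the response remains useful after changing the request.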
Because you are looking at a particular transaction ID, this might not be a sampling issue.
The ratio looks consistent (10.37% according to your question), so I believe this is related to the currency you are using.
Try using the local currency metrics when requesting monetary values.
For example:
ga:localTransactionRevenue instead of ga:transactionRevenue
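Applied to the request body from the question, the substitution could look like the sketch below. The `use_local_currency` helper is hypothetical, and the mapping deliberately contains only the one metric named above; other revenue metrics are left unchanged because their local-currency counterparts are not confirmed here.

```python
import copy

# Only the substitution named in the answer; extend at your own discretion.
LOCAL_CURRENCY_METRICS = {"ga:transactionRevenue": "ga:localTransactionRevenue"}


def use_local_currency(body: dict) -> dict:
    """Return a copy of the body with known metrics swapped for local-currency ones."""
    updated = copy.deepcopy(body)
    for request in updated.get("reportRequests", []):
        for metric in request.get("metrics", []):
            expression = metric.get("expression")
            metric["expression"] = LOCAL_CURRENCY_METRICS.get(expression, expression)
    return updated


body = {
    "reportRequests": [
        {
            "viewId": "xxxxxxxxxx",
            "metrics": [
                {"expression": "ga:transactionRevenue", "alias": "transactionRevenue"},
                {"expression": "ga:itemRevenue", "alias": "itemRevenue"},
            ],
        }
    ]
}
local_body = use_local_currency(body)
```

The aliases are kept as they were, so downstream code that reads `transactionRevenue` from the response headers would not need to change.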

Google Analytics calculated metric not available from core reporting API

Here's how the calculated metric is defined in the Google Analytics web interface:
Name
Profit
External Name
calcMetric_Profit
Formatting Type
Currency (Decimal)
Formula
{{Revenue}} - {{Shipping}} - {{Tax}}
The "External Name" field has this description:
The external name is used to uniquely identify the calculated metric when querying via API. External names may contain alphanumeric and underscore characters only. Special characters, symbols, and spaces are not allowed.
When I request the metric from v4 of the reporting API, using this request:
POST https://analyticsreporting.googleapis.com/v4/reports:batchGet?key=<redacted>
{
"reportRequests": [
{
"dateRanges": [
{
"startDate": "2018-01-01",
"endDate": "2018-01-22"
}
],
"metrics": [
{
"expression": "ga:calcMetric_Profit"
}
],
"viewId": "<redacted>"
}
]
}
I get this response:
400
{
"error": {
"code": 400,
"message": "Invalid value 'ga:calcMetric_Profit' for metric parameter.",
"status": "INVALID_ARGUMENT"
}
}
Am I doing something wrong?
Is this related to calculated metrics still being in Beta?
Change your startDate to one day after you created the calculated metric. Calculated metrics apply only from the date they were created (they are not backfilled). So if your startDate is before the creation date, the API will throw an error, because the calculated metric doesn't exist for dates prior to its creation.
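If the creation date of the calculated metric is known, the start date can be adjusted before the request is sent. This is only a sketch: `clamp_start_dates` is a hypothetical helper, the creation date below is invented for illustration, and the code assumes absolute YYYY-MM-DD dates rather than relative values such as "7daysAgo".

```python
import copy
from datetime import date, timedelta


def clamp_start_dates(body: dict, created_on: date) -> dict:
    """Return a copy of the body whose date ranges start after the creation date."""
    earliest = created_on + timedelta(days=1)
    updated = copy.deepcopy(body)
    for request in updated.get("reportRequests", []):
        for date_range in request.get("dateRanges", []):
            start = date.fromisoformat(date_range["startDate"])
            if start < earliest:
                date_range["startDate"] = earliest.isoformat()
    return updated


body = {
    "reportRequests": [
        {
            "dateRanges": [{"startDate": "2018-01-01", "endDate": "2018-01-22"}],
            "metrics": [{"expression": "ga:calcMetric_Profit"}],
            "viewId": "<redacted>",
        }
    ]
}
# Hypothetical creation date, for illustration only.
clamped = clamp_start_dates(body, created_on=date(2018, 1, 10))
```

The sketch does not check whether the adjusted start date ends up after the endDate; a real implementation would need to handle that case.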

Adobe Analytics API - Real Time Classification

I need to get a classified eVar from the Omniture real-time API, exclude some values, and then break it down by sitesection.
I tried this query:
{
"reportDescription": {
"source": "realtime",
"reportSuiteID": "**RSID**", //MY REPORT SUITE
"metrics": [{
"id": "instances"
}],
"elements": [{
"id": "evar", //MY EVAR
"top": 100,
"classification": "Real Time", //CLASSIFICATION NAME
"search": {
"type": "NOT",
"keywords": ["somevalue"] //THE VALUE TO EXCLUDE
}
},{
"id" : "sitesection",
"top" : 1
}],
"dateGranularity": "minute:1",
"dateFrom": "-1 minute"
}
}
But the JSON response still contains "somevalue", as if it had not been excluded.
The strange thing is that if I remove the breakdown (by sitesection), the classification filter seems to work fine.
Is it not possible to use a classification filter when a breakdown is used in a real-time report? I can't find any documentation about that.
Another thing: if I request a report with the classification but without any search, I receive a response, but it contains a lot of "::Unspecified::" entries. The "::Unspecified::" entries seem to be the most recent data that Omniture received from my web pages. I think this means that classifications are not applied in real time, even though they can be used in a real-time report.

GA API differences with UI

I am building an app that uses segments to let a user enter a URL and check different metrics for it.
I see some significant differences between reports pulled with the API and report generated in the UI. Maybe I am misunderstanding something about segments.
For example, I have a segment designed to show only users who visited a specific page, which maps to a small fraction of my users.
When looking at the user age brackets, I see small numbers scattered across all categories.
Now to run the equivalent report in the API, I am using the payload below.
{
"reportRequests": [
{
"viewId": "#####",
"dateRanges": [
{ "startDate": "2017-03-01",
"endDate": "2017-04-27" }
],
"metrics": [
{"expression": "ga:pageviews"},
{"expression": "ga:sessions"},
{"expression": "ga:users"}
],
"segments": [
{
"dynamicSegment": {
"name": "Users of /apath/ofinterest/",
"userSegment": {
"segmentFilters": [
{
"simpleSegment": {
"orFiltersForSegment": {
"segmentFilterClauses": [
{
"dimensionFilter": {
"dimensionName": "ga:pagePath",
"operator": "EXACT",
"expressions": [
"/apath/ofinterest/"
]}}]}}}]}}}
],
"dimensions": [
{ "name": "ga:userAgeBracket" },
{ "name": "ga:segment" }
]
}
]
}
This yields completely different results:
Only 2 age brackets and, weirdly, the same number of users in each (I tried different time frames and got the same behaviour).
Any ideas on what could be wrong? Could it be something in the settings of the segment? Is it related to "Sessions/User Include"?
Or could the warning that I see in the UI have a different impact in the UI and the API?
According to this comment, it sounds like numbers might be calculated differently for the API and the UI. Is that still the case?
Thanks a lot!
GA UI data is picked from the pre-aggregated tables.
Pre-calculated data -- pre-aggregated tables
This is precalculated data that Google uses to speed up the UI. Google does not specify when the aggregation is done; it can happen at any point in time. These are known as pre-aggregated tables.
So if you compare the numbers from the GA UI with your GA API data, you can see a discrepancy, because the UI view might have been aggregated earlier in the day while your API data is fresh.
