Different Results In Exporting Data From Google Analytics 4 - google-analytics

I have tried the Google Analytics Data API, and the problem is that the numbers I get from the API don't match what I see in the Google Analytics web UI. I even tried the Query Explorer for the Google Analytics Data API, but again got fewer results.
As far as I can tell, Google applies sampling in the GA Data API (although I only have a few hundred results per day). I have heard there is no option to control sampling in the GA Data API, but that the GA Reporting API offered one through its sampling size option, so I decided to use that instead. To my surprise, the Reporting API only accepts a View ID, and I only have a Property ID because I am using GA4 (I can't enable Universal Analytics).
So where does this leave me? I tried the Management API, but it is for managing configuration, not for viewing data. I tried importing the data into Looker and exporting it from there, but Looker gives the same results as the Query Explorer, which are fewer than in the Google Analytics web UI.
I want to do this programmatically, so I can't just use the manual export option in GA4.
I need help finding the best way to do this.
I ran the same query in the Google Analytics Data API, the Query Explorer, and the GA web UI. The GA web UI shows the most results, while the Query Explorer and the Google Analytics Data API show the same, smaller results after sampling. Screenshots are attached for reference.
GA WEB UI _ QUERY
GA WEB UI _ RESULTS EXPORTED TO CSV
QUERY EXPLORER_ QUERY
QUERY EXPLORER_ RESULTS
GA4 DATA API_ RESULTS
EVEN TRIED 3rd PARTY SERVICE Supermetrics GOT SAME
Here is the code for the GA4 Data API:
`
from apiclient.discovery import build
from oauth2client.service_account import ServiceAccountCredentials
import pandas as pd
from collections import defaultdict

SCOPES = ['https://www.googleapis.com/auth/analytics.readonly']
KEY_FILE_LOCATION = 'MY_GA.json'
property_id = 'properties/266233569'
dimensions = ["date", "platform", "customEvent:order_id"]
metrics = ['sessions']

# A batch containing a single report request for one day of data.
request = {
    "requests": [
        {
            "dateRanges": [
                {
                    "startDate": "2022-10-19",
                    "endDate": "2022-10-19"
                }
            ],
            "dimensions": [{'name': name} for name in dimensions],
            "metrics": [{'name': name} for name in metrics],
            "limit": 10000,
            "return_property_quota": True,
            "keep_empty_rows": True
        }
    ]
}

# Authenticate with the service account and build the Data API client.
credentials = ServiceAccountCredentials.from_json_keyfile_name(KEY_FILE_LOCATION, SCOPES)
analytics = build('analyticsdata', 'v1beta', credentials=credentials)

# Run the batch; the statement must not be split after "response =".
response = analytics.properties().batchRunReports(
    property=property_id, body=request).execute()
`

I believe it's a GA4 limitation. If you're considering third-party solutions, Able CDP has its own tracking that connects to BigQuery (so easily accessible in Looker) and provides raw tracking data without sampling.
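Before concluding that sampling is the cause, it may be worth looking at the metadata that comes back with each report. The following is a minimal sketch, not a confirmed diagnosis: it assumes the response is the plain dict returned by the code above, and that the metadata uses the field names `dataLossFromOtherRow`, `subjectToThresholding` and `samplingMetadatas` as I understand the `RunReportResponse` schema (verify them against the current API reference). The helper name `report_flags` and the example response are hypothetical.

```python
def report_flags(report):
    """Return human-readable warnings found in one report's metadata."""
    meta = report.get("metadata", {})
    warnings = []
    # Assumed field names; check them against the Data API reference.
    if meta.get("dataLossFromOtherRow"):
        warnings.append("some rows were rolled up into an '(other)' row")
    if meta.get("subjectToThresholding"):
        warnings.append("some rows were withheld by data thresholding")
    if meta.get("samplingMetadatas"):
        warnings.append("results were sampled")
    return warnings


# Hypothetical response, shaped like the output of batchRunReports.
example_response = {
    "reports": [
        {"rowCount": 412, "metadata": {"subjectToThresholding": True}},
    ]
}

for index, report in enumerate(example_response["reports"]):
    print(index, report.get("rowCount"), report_flags(report))
```

If none of these flags is set, the difference from the web UI probably has another cause, such as a different date range, dimension set or filter.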

Related

missing custom event in GA4 through Google Analytics Data API v4

We have a website with GA4 and want to set up data collection using Google Analytics Data API v4. We can see all the custom events we defined in the Google Analytics UI under Engagement/Events. But when we call https://analyticsdata.googleapis.com/v1beta/ some of them are missing.
To be specific, we are running the following to get the list of all dimensions and metrics:
from google.analytics.data_v1beta import BetaAnalyticsDataClient
from google.analytics.data_v1beta.types import GetMetadataRequest
property_id = "PROPERTY_ID"
client = BetaAnalyticsDataClient()
request = GetMetadataRequest(name=f"properties/{property_id}/metadata")
print(client.get_metadata(request=request))
Does anyone know why some events are not showing up with the above code?
Thank you!
EDIT:
I found that only events marked as conversion (Configure/Events) show up in the metadata. But why?
EDIT2:
According to https://developers.google.com/analytics/devguides/reporting/data/v1/advanced#event-scoped_custom_dimensions, the endpoint https://analyticsdata.googleapis.com/v1beta/properties/GA4_PROPERTY_ID/metadata only returns metrics and dimensions. So I guess I need to use something else to retrieve all events?
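One possible route, sketched here as an assumption rather than something confirmed in this thread: event names are values of the `eventName` dimension, so a regular report with that dimension and the `eventCount` metric should list the events that were actually collected. The sketch below works on plain dicts in the REST shape of `runReport` (request body in, response out) so it can be tried without credentials; the helper names and the sample response are hypothetical.

```python
def build_event_list_request(start_date, end_date):
    """Build a runReport body that lists every event name with its count."""
    return {
        "dateRanges": [{"startDate": start_date, "endDate": end_date}],
        "dimensions": [{"name": "eventName"}],
        "metrics": [{"name": "eventCount"}],
    }


def event_counts(response):
    """Map event name to count from a runReport-shaped response dict."""
    counts = {}
    for row in response.get("rows", []):
        name = row["dimensionValues"][0]["value"]
        counts[name] = int(row["metricValues"][0]["value"])
    return counts


# Hypothetical response for illustration only.
sample_response = {
    "rows": [
        {"dimensionValues": [{"value": "page_view"}],
         "metricValues": [{"value": "120"}]},
        {"dimensionValues": [{"value": "my_custom_event"}],
         "metricValues": [{"value": "7"}]},
    ]
}

print(build_event_list_request("2022-10-01", "2022-10-31"))
print(event_counts(sample_response))
```

The same dimension and metric names can be passed to `BetaAnalyticsDataClient.run_report` if you prefer the client library used in the question.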

Google Analytics UserID API extraction

I'm trying to extract a userID from Google Analytics, in order to see which users use the website most and least.
I would like to retrieve the top 5 user IDs and bottom 5 user IDs that were passed using either:
gtag('config', 'GA_TRACKING_ID', {
'user_id': 'USER_ID'
});
OR
using a custom dimension... ( https://support.google.com/analytics/answer/2709828?hl=en )
I'm (vaguely) aware of policies and TOS to keep 'non identifying' information on Google BUT there are posts online indicating you can link back to CMS data.
Steps so far
Google Analytics with UserID and view setup - Working in Google dashboard and showing filtered userID and All website data using the idea.
Requirements:
Extract page view and session data for each userId between a date range (or all by default)
UserID via standard GA method
UserID via Custom dimension method
Any help, pointers or examples of how someone has completed something like this are appreciated.
NOTE: This is to PULL data out of GA and manipulate/display it on an external system/dashboard.
I have seen this, which states it's not possible: "Google analytics userID tracking"
and this, which states it (kind of) is: "google analytics API implementation for tracking a specific user activities in php"
The solution I used:
Tracking
Create Google Analytics account
Create a new view by activating the UserID tracking (labeled NewView1)
Use https://developers.google.com/analytics/devguides/collection/gtagjs/custom-dims-mets
i.e. Define your custom dimension
Get Analytics tracking code + Add custom definition code
Create a Custom report using the 'metrics' you want to see and filtering by the 'custom dimension' I created earlier.
(note: data took ~ 12 hours to be visible so don't expect to work instantly)
Front end tracking additions
gtag('config', 'GA_TRACKING_ID', {
'custom_map': {'dimension<Index>': 'dimension_name'}
});
// Sends the custom dimension to Google Analytics.
gtag('event', 'any_event_name', {'dimension_name': dimension_value});
Extraction
Create New Google Developer Console Project (API)
Use a Service Account to connect the API with Analytics ( https://developers.google.com/analytics/devguides/reporting/core/v4/quickstart/service-php ) -- API -> credentials -> Create Credentials (Service Account key)
(assign role [mine was set to Project => View])
Save the Text file to your filesystem and rename to json as per examples
Add permissions to your Analytics project by copy/pasting the 'Email' from the Service account details into Analytics User Management.
Get the view ID of the data you wish to extract (the Analytics view created in step 2 under Tracking)
Use the sample Code (HelloAnalytics.php) to connect and extract data
Use your custom dimension to filter results
The dimension filter I used was $dimensions (see below)
...
$dimensions = new \Google_Service_AnalyticsReporting_Dimension();
$dimensions->setName('ga:dimension1'); // as per docs
// Create the ReportRequest object.
$request = new \Google_Service_AnalyticsReporting_ReportRequest();
$request->setViewId($VIEW_ID);
$request->setDateRanges($dateRange);
$request->setMetrics(array($sessions, $pageviews));
$request->setDimensions(array($dimensions)); // expects an array of dimensions, like setMetrics()
From there I was able to see the same data via API that I could see in the custom report on analytics web.
NOTE: be careful of which Google project and view you're in while setting up permissions and dimensions.
NOTE: using gtag() code and not Universal (ga()) js code
The answer is a very brief/rough summary of how I achieved my specific goal. It is not a one-size-fits all solution but hopefully it will give someone a better idea of how to set and extract custom variable data within Google.
The final result was data from the API. From there it's up to you.
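To get from that API result to the "top 5 / bottom 5" users asked about in the question, the rows still need to be ranked. Here is a rough sketch in Python, assuming the Reporting API v4 response has been decoded into a dict with the usual `reports[].data.rows[]` layout, that the custom dimension holding the user ID is the first dimension, and that sessions is the first metric. The function names and the sample data are hypothetical.

```python
def sessions_by_user(response):
    """Sum sessions per user ID from a Reporting API v4 style response dict."""
    totals = {}
    for report in response.get("reports", []):
        for row in report.get("data", {}).get("rows", []):
            user_id = row["dimensions"][0]                   # ga:dimension1
            sessions = int(row["metrics"][0]["values"][0])   # first metric
            totals[user_id] = totals.get(user_id, 0) + sessions
    return totals


def top_and_bottom(totals, n=5):
    """Return (most active, least active) lists of (user_id, sessions)."""
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n], ranked[-n:][::-1]


# Hypothetical decoded response for illustration only.
sample = {"reports": [{"data": {"rows": [
    {"dimensions": ["user-a"], "metrics": [{"values": ["12", "40"]}]},
    {"dimensions": ["user-b"], "metrics": [{"values": ["3", "9"]}]},
    {"dimensions": ["user-c"], "metrics": [{"values": ["7", "21"]}]},
]}}]}

top, bottom = top_and_bottom(sessions_by_user(sample), n=2)
print(top, bottom)
```

With fewer than `2 * n` users the two lists overlap, which is usually acceptable for a dashboard but worth knowing.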

Why does the data I get from the Google Analytics API not match the Google Analytics web interface?

I used the Google API to get data from Google Analytics, but the metrics are not the same as in the Google Analytics web interface.
For example, for 2015-03-01 the API returns 79 pageviews,
but the Google Analytics web interface shows 80.
I searched for similar questions, but almost all of them say the solution is the sampling level.
I tried setting a different sampling level:
DataResource.GaResource.GetRequest request = Service.Data.Ga.Get(
    profileId,
    startDate.ToString("yyyy-MM-dd"),
    endDate.ToString("yyyy-MM-dd"),
    string.Join(",", metrics));
if (dimensions != null)
{
    request.Dimensions = string.Join(",", dimensions);
}
request.SamplingLevel = DataResource.GaResource.GetRequest.SamplingLevelEnum.HIGHERPRECISION;
request.StartIndex = startIndex;
return request;
After that, the result was the same as before; nothing changed.
Does anyone know what causes this?
Simply put, it's sampled data vs. unsampled data, which you can read about here: https://support.google.com/analytics/answer/1042498?hl=en
For API work I normally use a web query explorer to verify that my API calls are being sent and that the responses match: https://ga-dev-tools.appspot.com/explorer/
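Whether a particular response was actually sampled can be checked from the response itself. This is a small sketch in Python, assuming the Core Reporting API (v3) response has been decoded into a dict and carries the `containsSampledData`, `sampleSize` and `sampleSpace` fields; the helper name and the example values are hypothetical.

```python
def sampling_summary(response):
    """Describe whether a Core Reporting API v3 response was sampled."""
    if not response.get("containsSampledData"):
        return "unsampled"
    # sampleSize and sampleSpace arrive as strings in the JSON response.
    size = int(response.get("sampleSize", 0))
    space = int(response.get("sampleSpace", 0))
    if space == 0:
        return "sampled: sample size unknown"
    return "sampled: %.1f%% of sessions" % (100.0 * size / space)


# Hypothetical responses for illustration only.
print(sampling_summary({"containsSampledData": False}))
print(sampling_summary({"containsSampledData": True,
                        "sampleSize": "250", "sampleSpace": "1000"}))
```

If the response reports that it is unsampled and the numbers still differ, the sampling level is unlikely to be the explanation, and the query parameters (date range, segment, filters) are the next thing to compare.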

Google Analytics: Real Time Reporting API and Filtering by Custom Dimension

Any ideas how I would filter the response based on a custom dimension when making an API call to the Real Time Reporting API?
Below is what I have tried setting the filter to in the API call; my custom dimension is called dimension1.
filters=rt:dimension1==test
filters=ga:dimension1==test
Neither of the above filters works for the Real Time API.
But it seems I can use the following filter when making an API call to the Core Reporting API:
filters=ga:dimension1==test
So it looks like the Real Time API doesn't support custom dimensions.
The list of dimensions and metrics supported by the Real Time API is below.
https://developers.google.com/analytics/devguides/reporting/realtime/dimsmets/
You should be able to use filters=rt:dimension1==test as long as you are putting the dimension in your call. I'm not sure how you are making the call to the API. I'm using the javascript library and sending in JSON like this:
'params': {'ids': 'ga:xxxxxx', 'metrics': 'rt:totalEvents', 'dimensions': [ 'rt:eventCategory', 'rt:eventAction', 'rt:eventLabel', 'rt:minutesAgo'], 'filters': 'rt:eventLabel==Submit', 'fields': 'rows' }
Hope this helps!
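For reference, the `filters` string in the call above can be assembled with a small helper. This is only a sketch in Python of the `name==value` filter syntax; it assumes, as I understand the Core Reporting filter rules, that a backslash, comma or semicolon inside the value has to be backslash-escaped (commas and semicolons otherwise act as OR and AND). The function names are hypothetical, and the sketch says nothing about whether a given `rt:` dimension is supported.

```python
def escape_filter_value(value):
    """Backslash-escape characters that are special in a filter expression."""
    # Escape the backslash first so later escapes are not doubled.
    for ch in ("\\", ",", ";"):
        value = value.replace(ch, "\\" + ch)
    return value


def equals_filter(dimension, value):
    """Build an exact-match filter such as rt:eventLabel==Submit."""
    return "%s==%s" % (dimension, escape_filter_value(value))


params = {
    "ids": "ga:xxxxxx",
    "metrics": "rt:totalEvents",
    "dimensions": "rt:eventLabel",
    "filters": equals_filter("rt:eventLabel", "Submit"),
}
print(params["filters"])
```

Note that the dimension used in the filter also appears in `dimensions`, matching the advice in the answer above.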

What is the best method to measure site visits and page views in real time?

I currently use Adobe Omniture SiteCatalyst, Google Analytics, and New Relic. All three offer visit and page view metrics. SiteCatalyst has no API that I'm aware of, and their data is often hours behind. Google Analytics and New Relic both offer realtime APIs, but I find that the metrics offered differ wildly across vendors.
What's the best method (API) for measuring realtime visits (page views, unique visitors, etc.)?
Ultimately, I intend to use this data to present realtime conversion rates to my business customers.
Adobe SiteCatalyst does have a realtime API that you can use. It works much like the reports in SiteCatalyst itself.
Here is an example request in Python:
import base64
import hashlib
import json
import time

import requests

your_report_suite = "ReportSuiteId"  # The name of the report suite
what_you_are_looking = "someValue"  # Value of the prop that you want to find in the realtime stream
sc_user = "USERNAME"  # Placeholder: your web services username
sc_key = "SHARED_SECRET"  # Placeholder: your web services shared secret


def generateHeader():
    # Generates the SC (WSSE) header for the request
    nonce = str(time.time())
    base64nonce = base64.b64encode(nonce.encode()).decode()
    # The trailing "Z" marks UTC, so format the UTC time
    created_date = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())
    sha_object = hashlib.sha1((nonce + created_date + sc_key).encode())
    password_64 = base64.b64encode(sha_object.digest()).decode()
    return 'UsernameToken Username="%s", PasswordDigest="%s", Nonce="%s", Created="%s"' % (
        sc_user, password_64, base64nonce, created_date)


def getRealTimeUsers():
    url = 'https://api.omniture.com/admin/1.3/rest/?method='
    headers = {'X-WSSE': generateHeader()}
    method = 'Report.GetRealTimeReport'
    report_url = url + method
    payload = {
        "reportDescription": {
            "reportSuiteID": your_report_suite,
            "metrics": [
                {
                    "id": "instances"
                }
            ],
            "elements": [
                {
                    "id": "prop4",
                    "search": {
                        "type": "string",
                        "keywords": what_you_are_looking
                    }
                }
            ]
        }
    }
    response = requests.post(url=report_url, headers=headers, data=json.dumps(payload))
    # Return the rows of the realtime report
    return response.json().get('report').get('data')
Note: Realtime reporting requires that the realtime feature is turned on in your report suite. Also the realtime reports are limited in their dimensionality. There is not a whole lot of documentation on the particular requests required but there is this: https://marketing.adobe.com/developer/documentation/sitecatalyst-reporting/c-real-time
Also I highly recommend experimentation by using the api explorer: https://marketing.adobe.com/developer/api-explorer#Report.GetRealTimeReport
What kind of delay is acceptable? What about accuracy and detail? Script-based systems like Google Analytics require Javascript to be enabled and provide plenty of details about the visitor's demographic and technical information, but raw webserver logfiles give you details about every single request (which is better for technical insight, as you get details on requested images, hotlinking, referrers and other files).
Personally, I'd just use Google Analytics because I'm familiar with it, and also because their CDN servers mean that my site won't load slowly. Otherwise I just run typical logfile analysis software on my raw webserver logs; depending on your software, though, this analysis can take time to generate a report.
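As a rough illustration of the logfile approach, here is a minimal Python sketch that counts page views per minute. It assumes the server writes the Common/Combined Log Format, treats every successful GET that is not an obvious static asset as a page view, and uses a made-up list of asset extensions and made-up sample lines; real logfile analysers do considerably more (bot filtering, sessions, unique visitors).

```python
import re
from collections import Counter

# host ident user [timestamp] "METHOD path protocol" status size
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) [^"]*" (\d{3}) ')
STATIC_SUFFIXES = (".png", ".jpg", ".gif", ".ico", ".css", ".js")


def page_views_per_minute(lines):
    """Count successful GET requests for pages, keyed by minute."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        _host, timestamp, method, path, status = match.groups()
        if method != "GET" or status != "200":
            continue
        if path.lower().endswith(STATIC_SUFFIXES):
            continue  # images, styles and scripts are not page views
        counts[timestamp[:17]] += 1  # e.g. "10/Oct/2000:13:55"
    return counts


# Made-up sample lines for illustration only.
sample_lines = [
    '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '127.0.0.1 - - [10/Oct/2000:13:55:37 -0700] "GET /logo.png HTTP/1.0" 200 512',
    '10.0.0.2 - - [10/Oct/2000:13:55:59 -0700] "GET /pricing HTTP/1.1" 200 1044',
    '10.0.0.2 - - [10/Oct/2000:13:56:02 -0700] "GET /missing HTTP/1.1" 404 209',
]

print(page_views_per_minute(sample_lines))
```

Tailing the live log and feeding new lines to the same function would give a near-realtime view without any JavaScript on the page.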
