RGoogleAnalytics Segment not Agreeing with GA Web Output - r

Using R and the 'RGoogleAnalytics' package, I have run into an issue where the result of a query in R does not line up with the table I get for the same segment in the Google Analytics report. I am using Google's Query Explorer to obtain the segment 'gaid', but I have also tried using the segment definition that the Query Explorer provides.
query.list <- Init(
  start.date = "2015-05-01",
  end.date = "2015-05-31",
  dimensions = "ga:date",
  metrics = "ga:users,ga:sessions",
  segments = "gaid::...",
  max.results = 10000,
  sort = "ga:date",
  table.id = "...")
ga.query <- QueryBuilder(query.list)
ga.data <- GetReportData(ga.query, oauth_token,
                         split_daywise = TRUE)
The code runs and gives me a data frame, but the data frame does not match the tables I get with the same segment definition in the Google Analytics interface.
The segment of interest excludes specific users (defined by a custom user dimension) and has several session restrictions. The R code is including extra users and sessions for some reason.
I've looked into the data to find out which users and sessions are getting through, but I cannot find any consistent pattern in what gets past the filter. Is anyone else running into this issue? Any suggestions?
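One way to narrow down where the two sources disagree is to line them up day by day. The sketch below is only an illustration: `compare_by_date` is a hypothetical helper, the column names are assumptions about how the API result and a UI export would be shaped, and the numbers are made up.

```r
# Hypothetical helper: compare per-day metrics from the API with a UI export.
# Both data frames are assumed to have a `date` column plus the metric columns.
compare_by_date <- function(api_df, ui_df, metrics = c("users", "sessions")) {
  merged <- merge(api_df, ui_df, by = "date", suffixes = c(".api", ".ui"))
  for (m in metrics) {
    merged[[paste0(m, ".diff")]] <-
      merged[[paste0(m, ".api")]] - merged[[paste0(m, ".ui")]]
  }
  diff_cols <- paste0(metrics, ".diff")
  # Keep only the days on which at least one metric disagrees.
  mismatch <- rowSums(merged[, diff_cols, drop = FALSE] != 0) > 0
  merged[mismatch, , drop = FALSE]
}

# Made-up numbers purely for illustration.
api_df <- data.frame(date = c("20150501", "20150502", "20150503"),
                     users = c(120, 98, 143),
                     sessions = c(150, 110, 160))
ui_df <- data.frame(date = c("20150501", "20150502", "20150503"),
                    users = c(120, 95, 143),
                    sessions = c(150, 104, 160))

mismatches <- compare_by_date(api_df, ui_df)
print(mismatches)
```

Looking only at the mismatching days can make it easier to tell whether the extra users and sessions are spread evenly or concentrated on particular dates.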

Related

Google Analytics Data API (GA4) in R - Google Ads clicks

I use RStudio to export data from a GA4 property. I want to collect the Google Ads campaign name, the number of sessions, and the Google Ads clicks attributed to each campaign. The sessions come through, but Google Ads clicks shows 0, even though I can see in the GA4 property that the clicks are there. Has anyone else had this problem? I am using the publisherAdClicks metric.
I have not found any other metric through which I can retrieve this data, so do you have any idea what the problem is?
My code looks like this:
googleAdsSessionsClick <- google_analytics_set(
  c("sessions", "publisherAdClicks"),
  c("sessionCampaignName", "sessionSourceMedium"))
I use my own function:
google_analytics_set <- function(choosenMetrics = NULL, choosenDimentions = NULL) {
  ga_data(webPropertyId,
          date_range = date,
          metrics = choosenMetrics,
          dimensions = choosenDimentions)
}

Google analytics - metrics mismatch while exporting data via API with various set of dimensions

I am working on GA reporting metrics in Power BI via the Reporting API.
When I create a query with very basic metrics such as sessions and users, I get the same values that I see in the Google Analytics dashboard.
But when I add more dimensions and metrics, say user type, pageviews, or gender, along with users and sessions, the values of users and sessions are inflated.
I have gone through various documentation and know that not all dimensions and metrics can be combined, but in this case GA allowed me to combine these basic ones, and the results still do not match.
Is there any documentation that explains this behaviour, or has anyone experienced anything like this?
Does this have something to do with binning? Even if the difference were due to different binning on different counters, I would expect it to be small, a few percent, not the huge difference I am getting (several times the expected value).
I have come across this problem; the cause is a sampling limit in the Google Analytics Core Reporting API.
Sampling thresholds
Default reports are not subject to sampling.
Ad-hoc queries of your data are subject to the following general thresholds for sampling:
Analytics Standard: 500k sessions at the property level for the date range you are using
Analytics 360: 100M sessions at the view level for the date range you are using
That is, once the date range you request covers more than 500k sessions, Google Analytics returns sampled data rather than exact data.
To work around this limit, I break the query into separate queries, each restricted by a date filter (per year, month, or day, depending on data volume) so that each one stays under the 500k-session threshold. Then I append the results back into one table.
Sample M code:
(year as number, month as number) =>
let
    Source = GoogleAnalytics.Accounts(),
    ...,
    #"Added Items" = Cube.Transform(#"...", {
        {Cube.AddAndExpandDimensionColumn, "ga:pagePath", {"ga:pagePath"}, {"Page"}},
        {Cube.AddAndExpandDimensionColumn, "ga:pageDepth", {"ga:pageDepth"}, {"Page Depth"}},
        {Cube.AddAndExpandDimensionColumn, "ga:pageTitle", {"ga:pageTitle"}, {"Page Title"}},
        {Cube.AddAndExpandDimensionColumn, "ga:date", {"ga:date"}, {"Date"}},
        {Cube.AddMeasureColumn, "Page Load Time (ms)", "ga:pageLoadTime"}
    }),
    #"Filtered Rows" = Table.SelectRows(#"Added Items", each [Date] >= #date(year, month, 1) and [Date] <= Date.EndOfMonth(#date(year, month, 1)))
in
    #"Filtered Rows"
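For readers working in R rather than Power BI, the same split-by-period idea can be sketched as follows. This is a minimal illustration under stated assumptions, not a tested integration: `month_chunks` and `fetch_month` are hypothetical names, and `fetch_month` is a stub standing in for whatever API call you actually use.

```r
# Split a date range into calendar-month chunks (base R only).
month_chunks <- function(start, end) {
  start <- as.Date(start)
  end <- as.Date(end)
  first_month <- as.Date(format(start, "%Y-%m-01"))
  # First day of every month touched by the range.
  firsts <- seq(first_month, end, by = "month")
  # First day of the following month, used to derive each month's last day.
  nexts <- seq(first_month, by = "month", length.out = length(firsts) + 1)[-1]
  data.frame(start = pmax(firsts, start), end = pmin(nexts - 1, end))
}

# Hypothetical stand-in for the real API call; it only reports the chunk size.
fetch_month <- function(start, end) {
  data.frame(start = start, end = end, days = as.integer(end - start) + 1L)
}

chunks <- month_chunks("2016-01-15", "2016-03-10")

# One request per chunk, then append the pieces into a single data frame.
pieces <- lapply(seq_len(nrow(chunks)), function(i) {
  fetch_month(chunks$start[i], chunks$end[i])
})
combined <- do.call(rbind, pieces)
print(combined)
```

Whether monthly chunks are small enough depends on traffic; a busier property may need weekly or daily chunks to stay under the sampling threshold.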

Query multiple Google Analytics view ids using googleAnalyticsR v4 API package

I want to use the new googleAnalyticsR package to extract Google Analytics data using the v4 API.
The documentation (http://code.markedmondson.me/googleAnalyticsR/v4.html) demonstrates running a query with one ga_id, but not with multiple view IDs. There is another R package called GAR that permits querying multiple view IDs in a single Google Analytics query, but the googleAnalyticsR package includes v4 API features. I attempted to query multiple view IDs using ga_id <- c('viewId','viewId'), but the query returns an error. Is there a way to query multiple view IDs using the googleAnalyticsR v4 API?
This probably isn't supported by the API directly, but since you are using R, it is easy to achieve with a for loop. Below is an example where I query multiple GA views (1 view = 1 language version of the site):
viewId <- c(6006393, 79777098, 79781440, 79981805, 75315234, 78174757, 76630182, 79447058)
ga_data_final <- data.frame()
for (i in viewId) {
  ga_data_temp <-
    google_analytics_4(i, # the view ID changes on each iteration
                       date_range = c("2014-01-01",
                                      "2016-08-15"),
                       metrics = c("sessions"),
                       dimensions = c("yearMonth",
                                      "source",
                                      "medium"),
                       max = -1)
  ga_data_temp$viewId <- i
  ga_data_final <- rbind(ga_data_final, ga_data_temp)
}
The code above retrieves:
1 metric: number of sessions
3 dimensions: yearMonth, source, medium
It uses two data frames. The master one (ga_data_final) is created empty before the for loop starts. Each iteration pulls the rows for one view (temporarily stored in ga_data_temp) and appends them to the master data frame.
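As an alternative to growing the data frame inside the loop, the per-view results can be collected in a list and bound once at the end, which avoids re-copying the accumulated rows on every iteration. The sketch below uses a hypothetical `fetch_view` stub in place of the `google_analytics_4()` call so that it runs without credentials; the returned values are made up.

```r
# Hypothetical stand-in for the google_analytics_4() call in the loop above.
fetch_view <- function(view_id) {
  data.frame(yearMonth = c("201401", "201402"),
             sessions = c(10L, 20L))
}

view_ids <- c(6006393, 79777098)

# One data frame per view, each tagged with the view it came from.
per_view <- lapply(view_ids, function(id) {
  df <- fetch_view(id)
  df$viewId <- id
  df
})

# Bind all pieces in a single step.
ga_data_final <- do.call(rbind, per_view)
print(ga_data_final)
```

With only a handful of views the difference is negligible, so the for loop is perfectly fine there; the list approach mainly pays off when there are many views or many rows.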
Hope this helps.

R: Google Analytics MCF - Filtering by products

I need to filter Google Analytics MCF report by two basic product categories we sell (let's call them type1 and type2).
My current API call via R package RGA is:
get_mcf(profileId = 'xxxxx',
        start.date = '2016-10-01',
        end.date = '2016-12-31',
        metrics = 'mcf:totalConversions, mcf:totalConversionValue',
        dimensions = 'mcf:basicChannelGroupingPath',
        samplingLevel = 'HIGHER_PRECISION',
        max.results = 10000,
        token = 'xxxxx')
With this I can get MCF paths for all products.
What I need now is a filter or segment that restricts the report by product name, something like
ga:productName=#type1
How can I do this?
Thanks
There is a limited number of dimensions and metrics that you can use with the MCF API. As far as I can see, there is no mcf:productName dimension.
You can find the full list of dimensions and metrics available for this API in the Dimensions & Metrics Reference.

Google Analytics Query for R, content drilldown

I am trying to export Google Analytics data into R in order to build a report and do some other data mining related tasks with the data. I am using the RGoogleAnalytics package to do so. I have the connection working, but am having trouble specifying the correct query in order to obtain the right information.
I am trying to obtain information for a specific page, which I would normally reach by going to the Content Drilldown section in Google Analytics and searching for that page. I would also like to use a filtered view that excludes ISPs from my workplace. There are several websites under a particular view. I am trying to build a query that pulls this information automatically. I have tried the following query.
ValidateToken(token1)
query.list1 <- Init(start.date = "2016-10-28",
                    end.date = "2016-12-05",
                    dimensions = "ga:date",
                    metrics = "ga:uniquepageviews",
                    filters = "ga:pagePathLevel1==/ed######.edu/;ga:pagePathLevel2==/content/",
                    sort = "ga:date",
                    table.id = "ga:#####")
ga.query <- QueryBuilder(query.list1)
ga.data <- GetReportData(ga.query, token1)
This does not throw an error in R, but it does not seem to return any metrics (it returns all zeros for unique pageviews, when there are results), as shown below.
      date uniquepageviews
1 20161028               0
2 20161029               0
3 20161030               0
4 20161031               0
In the above, I tried to use the filter to get the correct page. Is this correct? If so, what should I put into the filter so that it only returns metrics for a specific page in a given view? Also, is there a way to select for a given prebuilt view? Any help is appreciated, thanks.
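In Core Reporting API filter expressions, conditions joined with `;` are ANDed, conditions joined with `,` are ORed, and literal commas, semicolons, and backslashes in values need a backslash escape. The helpers below are a hypothetical sketch of building such a string in R. The choice of `ga:hostname` and `ga:pagePath` and the example values are assumptions about the setup, not a confirmed fix for the zeros above.

```r
# Escape characters that have special meaning inside a filter expression.
escape_filter_value <- function(x) {
  gsub("([\\\\,;])", "\\\\\\1", x, perl = TRUE)
}

# Build a single condition such as "ga:pagePath=@/content/".
build_filter <- function(dimension, value, operator = "==") {
  paste0(dimension, operator, escape_filter_value(value))
}

# Join conditions with ";" so that all of them must hold (logical AND).
and_filters <- function(...) {
  paste(c(...), collapse = ";")
}

# Hypothetical values: exact host match AND page path containing "/content/".
page_filter <- and_filters(
  build_filter("ga:hostname", "www.example.edu"),
  build_filter("ga:pagePath", "/content/", operator = "=@")
)
print(page_filter)
```

The resulting string could then be passed as the `filters` argument of `Init()`. Building it on one line also avoids accidentally embedding a line break inside the filter string.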
