I need to retrieve data from Google Analytics using R.
I wrote the following code with googleAnalyticsR:
df <- google_analytics(viewId = my_id,
date_range=c(start,end),
metrics = c("pageViews"),
dimensions = "pagePath",
anti_sample = TRUE,
filtersExpression ="ga:pagePath==RisultatoRicerca?nomeCasa",
max=100000)
I need to set the filtersExpression parameter correctly.
I'd like to get data for every pagePath that contains RisultatoRicerca?nomeCasa. This code returns a data frame with 0 rows, which I know is impossible (the data comes from an e-commerce site with more than ten thousand interactions per day). So I have begun to think that my filtersExpression is incorrect.
Thanks in advance
I managed to solve the problem by changing filtersExpression:
filtersExpression = "ga:pagePath=@RisultatoRicerca?nomeCasa"
This filter works on the pagePath dimension and keeps every path that contains RisultatoRicerca?nomeCasa.
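For comparison, here is a minimal sketch of the exact-match and substring operators in the Core Reporting filter syntax, where == matches the whole value and =@ matches a substring. The helper ga_filter is hypothetical (it is not part of googleAnalyticsR); it only pastes the pieces into the kind of string that filtersExpression expects.

```r
# Hypothetical helper, not part of googleAnalyticsR: builds a filter string
# such as "ga:pagePath=@foo" from its three parts.
ga_filter <- function(dimension, operator, value) {
  # Operators documented for Core Reporting dimension filters
  valid <- c("==", "!=", "=@", "!@", "=~", "!~")
  stopifnot(operator %in% valid)
  paste0("ga:", dimension, operator, value)
}

# Exact match: only rows whose pagePath equals the value in full
exact <- ga_filter("pagePath", "==", "RisultatoRicerca?nomeCasa")

# Substring match: rows whose pagePath contains the value anywhere
contains <- ga_filter("pagePath", "=@", "RisultatoRicerca?nomeCasa")
```

Either string could then be passed as filtersExpression in the google_analytics() call from the question.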
sorted <- m$find(sort = '{"forks_count": -1}',
limit = 10)
How do I select only 3 features such as name, age, and income after applying the sort on the forks_count?
In standard MongoDB nomenclature, this is referred to as projection. General information about that is found on this page in their documentation.
Taking a look at the Mongolite User Manual, it seems like that project uses the fields parameter to provide this functionality. Based on that documentation, it looks like you can change your query to something similar to the following to get the results that you want:
sorted <- m$find(sort = '{"forks_count": -1}',
fields = '{"name" : true, "age" : true, "income" : true}',
limit = 10)
Note that, depending on the format you want your results in, you may also want to suppress the _id value (e.g. set it to false).
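As a rough sketch of that last point, the projection document can be built so that _id is switched off alongside the fields you keep. The helper name projection is made up for this example; only the fields argument of m$find comes from the Mongolite manual mentioned above.

```r
# Hypothetical helper: builds the JSON string for the `fields` argument.
projection <- function(keep, drop_id = TRUE) {
  parts <- sprintf('"%s": true', keep)
  # Suppress the automatically returned _id unless asked otherwise
  if (drop_id) parts <- c('"_id": false', parts)
  paste0("{", paste(parts, collapse = ", "), "}")
}

fields <- projection(c("name", "age", "income"))

# With a live connection `m`, as in the question, this would be used as:
# sorted <- m$find(sort = '{"forks_count": -1}', fields = fields, limit = 10)
```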
I am trying to use search_fullarchive from the rtweet package on a sandbox PREMIUM account with these exact search operators: park OR parks, lang:en and point_radius:[51.5074 0.1278 25mi]. I have tried the following:
test2 <- search_fullarchive(q = "park OR parks lang:en point_radius:[51.5074 0.1278 25mi]", n = 100, fromDate = "202003150000", toDate = "202003172359", env = "research", parse = TRUE, token = ActiveTravel_token)
The returned test2 object is a tbl_df filtered only by park OR parks. I've checked here and as a sandbox PREMIUM user I should be able to filter by lang: and point_radius:
Could someone please help me get the filtering to also match the other two operators, lang:en and point_radius:[51.5074 0.1278 25mi]?
Thanks in advance!
Best wishes,
Irena
This should be as simple as wrapping the text in parentheses, with the whitespace acting as a logical AND for the other fields.
q = "(park OR parks) lang:en point_radius:[51.5074 0.1278 25mi]"
However, I've just tried this search and, at the moment, it returns zero Tweets within that point radius over that date range. I substituted in another point radius (the Boulder, CO example from the Twitter API documentation, point_radius:[-105.27346517 40.01924738 10.0mi]), and it successfully brought back Tweets that matched the search parameters.
As to finding very few tweets: the point_radius operator will only return tweets that were geotagged manually by the user at the time of the tweet, and then only within a small area of at most 25 miles. Only a small fraction of tweets are geotagged. You will probably have more luck with the place: operator. It will also return tweets by people who have the "place" you search for set in their profile.
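One thing worth checking, offered as an assumption about why the London search came back empty: Twitter documents the operator as point_radius:[longitude latitude radius], which is the order used in the Boulder example. The sketch below builds the query with longitude first. The helper name point_radius and the central London coordinates (roughly 51.5074 N, 0.1278 W, so a negative longitude) are illustrative, not taken from the question.

```r
# Hypothetical helper: formats the operator with longitude first.
point_radius <- function(lon, lat, radius_mi) {
  stopifnot(lon >= -180, lon <= 180, lat >= -90, lat <= 90, radius_mi > 0)
  sprintf("point_radius:[%s %s %smi]",
          format(lon), format(lat), format(radius_mi))
}

# Whitespace between operators acts as a logical AND
q <- paste("(park OR parks)", "lang:en", point_radius(-0.1278, 51.5074, 25))

# With a valid token this would be passed on as in the question:
# search_fullarchive(q = q, n = 100, fromDate = "202003150000",
#                    toDate = "202003172359", env = "research",
#                    token = ActiveTravel_token)
```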
I am trying to get data on page categories from Analytics, but it seems that this variable does not exist, judging by https://ga-dev-tools.appspot.com/dimensions-metrics-explorer/.
If you do know how to get this data, the rest of the question does not apply.
If it indeed does not exist, these are the steps that I follow to get "grouped pages" data:
In order to get page data, I use the dim_filter argument
I define all the pages I need in all.pages
The function (x) allows for the iteration over the different page names
The actual iteration is done with the map function where I pass all.pages through the function (x)
Then, I create a dataframe
Because the actual name of the page used in the regex is not in the dataframe, I add them with the mutate function
all.pages <- c("page.name.1","page.name.2","page.name.3")
pages.ga <- function(x) {
  x.pages <- dim_filter(dimension = "pagePath", operator = "REGEXP", expressions = x)
  x.filter <- filter_clause_ga4(list(x.pages))
  x.results <<- google_analytics(ga_id,
                                 date_range = c("2019-09-01", "2019-11-30"),
                                 metrics = c("bounceRate", "pageviews", "avgTimeOnPage"),
                                 dim_filters = x.filter,
                                 max = -1)
}
list.pages.results <- map(all.pages,pages.ga)
df.pages <- dplyr::bind_rows(list.pages.results)
df.pages <- df.pages %>% mutate(page.name = all.pages)
This works when there is no dimension, but it is practically impossible with even a single dimension, because the number of rows will no longer match the length of all.pages.
Would anyone know how to overcome this?
Thanks in advance to all of you!
Best, D.
It exists as a content group dimension:
ga:contentGroupXX
Where XX is an integer. The simplest way to find it is to open the Query Explorer and type "content group" into the dimensions field. The integers you have available will populate, and you can check which one you are looking for.
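If content groups are not set up in the view, the per-page loop from the question can probably still be kept. A minimal sketch, assuming the label is attached inside the loop rather than afterwards: each result is tagged with its page name before binding, so the label is repeated over however many rows that page returned. fetch_page is a stand-in for the google_analytics() call and returns made-up rows so that the example runs without credentials.

```r
# Stand-in for the google_analytics() call in the question. The rows are
# invented purely to illustrate a result with one dimension.
fetch_page <- function(page) {
  data.frame(deviceCategory = c("desktop", "mobile"),
             pageviews = c(10, 5),
             stringsAsFactors = FALSE)
}

all.pages <- c("page.name.1", "page.name.2", "page.name.3")

# Tag every result with the page name it was fetched for
tagged <- lapply(all.pages, function(page) {
  res <- fetch_page(page)
  res$page.name <- rep(page, nrow(res))
  res
})

# Stack the per-page results into one data frame
df.pages <- do.call(rbind, tagged)
```

The same idea should carry over to purrr::map and dplyr::bind_rows if you prefer to stay in the tidyverse.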
Hey guys, I have tried many different ways to write this query:
query.list <- Init(start.date = "2016-09-19",
end.date = "2016-09-23",
dimensions = "ga:date,ga:hour,ga:minute,ga:country",
metrics = "ga:newUsers",
filters = "ga:source!=Email, ga:country==United Kingdom",
max.results = 10000,
sort = "ga:date",
table.id = "ga:XXXX"
)
ga.query <- QueryBuilder(query.list)
ga.data2 <- data.table(GetReportData(ga.query, token, split_daywise = T )
)
I do not know why it doesn't filter by country. I have tried filtering only on the country and it simply does not work. I'm sure it is something really simple that I'm missing, but I have tried every recommendation in other questions and it is still not working. If I take out the country filter it works, and if I put it in it does not do anything: it just outputs the same data.
You need the AND operator, which is denoted by a semicolon (;); a comma means OR. Moreover, you need to URL encode all the parameters in the filter, so the correct filter would be:
ga:source!=Email;ga:country%3D%3DUnited%20Kingdom
To all readers:
While building the query in R for Core Reporting, please make sure that all the values are URL encoded. If you have any difficulty with the encoding, you can build the query in Google's Query Explorer.
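A minimal sketch of doing the encoding in R itself, using utils::URLencode from base R. The helper encode_condition is hypothetical. Note that it also encodes the != operator, which the filter shown above leaves as is; both forms decode to the same string.

```r
# Hypothetical helper: keeps the dimension name readable and URL encodes
# the operator and the value (reserved = TRUE also encodes "=" and "!").
encode_condition <- function(dimension, operator, value) {
  paste0(dimension, utils::URLencode(paste0(operator, value), reserved = TRUE))
}

# Conditions joined with ";" are combined with AND
filters <- paste(
  encode_condition("ga:source", "!=", "Email"),
  encode_condition("ga:country", "==", "United Kingdom"),
  sep = ";"
)
```

The resulting string could then be supplied as the filters argument of Init().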
I want to fetch data from Google Analytics with M (Power Query/Power BI), but I want to filter the dimension values on the fly. Let's say my dimension is "Page" and I want the "Pageviews" and "Unique Pageviews" measures, provided that "Page" matches a regex filter (e.g. ga:pagePath=~^.*?([0-9]{6,7}|mpg[0-9]{1,3}){1}\.html[/]?[^ ]*).
I could use "Table.SelectRows", but as M doesn't support regex, this filter should be passed to the GA API directly. Here is what M generated for me:
let
Source = GoogleAnalytics.Accounts(),
#"1234567" = Source{[Id="1234567"]}[Data],
#"UA-987654-1" = #"1234567"{[Id="UA-52004541-1"]}[Data],
#"11111" = #"UA-987654-1"{[Id="1234567"]}[Data],
#"Added Items" = Cube.Transform(#"11111", {{Cube.AddAndExpandDimensionColumn, "ga:pagePath", {"ga:pagePath"}, {"Page"}}, {Cube.AddMeasureColumn, "Pageviews", "ga:pageviews"}, {Cube.AddMeasureColumn, "Unique Pageviews", "ga:uniquePageviews"}}),
#"Filtered Rows" = Table.SelectRows(#"Added Items", each Text.Contains([Page], "html"))
in
#"Filtered Rows"
Is there any possibility of passing my regex filter to the GA API in M?
Thanks for your interest!
We don't support regular expression filters today. With Text filters you can generally implement the same filter logic, but admittedly it wouldn't be as concise as your regex.
We welcome requests for new features at https://ideas.powerbi.com, so consider starting a thread there :)