I have tried many different ways to write this query:
query.list <- Init(start.date = "2016-09-19",
                   end.date = "2016-09-23",
                   dimensions = "ga:date,ga:hour,ga:minute,ga:country",
                   metrics = "ga:newUsers",
                   filters = "ga:source!=Emai, ga:country==United Kingdom",
                   max.results = 10000,
                   sort = "ga:date",
                   table.id = "ga:XXXX")
ga.query <- QueryBuilder(query.list)
ga.data2 <- data.table(GetReportData(ga.query, token, split_daywise = TRUE))
I do not know why it doesn't filter the country. I have tried filtering on the country alone and it still does not work. I'm sure it is something really simple that I'm missing, but I have tried every recommendation in other questions and it is still not working. If I take out the country filter the query works, and if I put it back nothing changes: it just outputs the same data.
You need to combine the two conditions with the AND operator, which is denoted by a semicolon (;); a comma denotes OR. Moreover, you need to URL-encode the parameters in the filter, so the correct filter would be:
ga:source!=Email;ga:country%3D%3DUnited%20Kingdom
To all readers:
While building the query in R for core reporting, please make sure that all the values are URL-encoded. If you have any difficulty with the encoding, you can build the query in Google's Query Explorer.
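As a minimal sketch of the encoding step, base R's utils::URLencode() with reserved = TRUE percent-encodes the operator and the value of the country condition and reproduces the filter string above. The helper name encode_condition is hypothetical and not part of rGoogleAnalytics, and the first condition is left unencoded only to mirror the answer.

```r
# Hypothetical helper (not part of rGoogleAnalytics): percent-encode the
# operator and the value of one filter condition, leaving the dimension as-is.
encode_condition <- function(dimension, operator, value) {
  paste0(dimension,
         utils::URLencode(operator, reserved = TRUE),
         utils::URLencode(value, reserved = TRUE))
}

country <- encode_condition("ga:country", "==", "United Kingdom")

# A semicolon joins the two conditions with AND.
filters <- paste("ga:source!=Email", country, sep = ";")
print(filters)
# [1] "ga:source!=Email;ga:country%3D%3DUnited%20Kingdom"
```

The resulting string would then be passed as the filters argument of Init().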
sorted <- m$find(sort = '{"forks_count": -1}',
                 limit = 10)
How do I select only 3 features such as name, age, and income after applying the sort on the forks_count?
In standard MongoDB nomenclature, this is referred to as projection. General information about it can be found in the MongoDB documentation.
Taking a look at the Mongolite User Manual, it seems like that project uses the fields parameter to provide this functionality. Based on that documentation, it looks like you can change your query to something similar to the following to get the results that you want:
sorted <- m$find(sort = '{"forks_count": -1}',
                 fields = '{"name" : true, "age" : true, "income" : true}',
                 limit = 10)
Note that, depending on exactly what format you want your results in, you may also want to suppress the _id value (e.g. set it to false).
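A minimal sketch of building such a projection string, including the _id suppression mentioned above. The helper projection_json is hypothetical (it is not part of mongolite); it only assembles the JSON text that would be handed to the fields argument.

```r
# Hypothetical helper: build the JSON string for mongolite's `fields`
# argument from a vector of field names, optionally hiding _id.
projection_json <- function(keep, drop_id = TRUE) {
  parts <- sprintf('"%s" : true', keep)
  if (drop_id) parts <- c(parts, '"_id" : false')
  paste0("{", paste(parts, collapse = ", "), "}")
}

fields <- projection_json(c("name", "age", "income"))
cat(fields, "\n")

# With a live mongolite collection `m` (as in the question), it would be used as:
# sorted <- m$find(sort = '{"forks_count": -1}', fields = fields, limit = 10)
```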
I have a database called "db" with a table called "company" which has a column named "name".
I am trying to look up a company name in db using the following query:
dbGetQuery(db, 'SELECT name,registered_address FROM company WHERE LOWER(name) LIKE LOWER("%APPLE%")')
This gives me the following correct result:
name
1 Apple
My problem is that I have a bunch of companies to look up, and their names are in the following data frame:
df <- as.data.frame(c("apple", "microsoft","facebook"))
I have tried the following method to get the company name from my df and insert it into the query:
sqlcomp <- paste0("'SELECT name, ","registered_address FROM company WHERE LOWER(name) LIKE LOWER(",'"', df[1,1],'"', ")'")
dbGetQuery(db,sqlcomp)
However this gives me the following error:
tinyformat: Too many conversion specifiers in format string
I've tried several other methods but I cannot get it to work.
Any help would be appreciated.
This code should work:
df <- as.data.frame(c("apple", "microsoft","facebook"))
comparer <- paste(paste0(" LOWER(name) LIKE LOWER('%",df[,1],"%')"),collapse=" OR ")
sqlcomp <- sprintf("SELECT name, registered_address FROM company WHERE %s",comparer)
dbGetQuery(db,sqlcomp)
Hope this helps you move on.
Using paste to insert data into a query is generally a bad idea because of SQL injection (whether truly malicious injection or just accidental corruption of the query). It's also better to keep the query free of "raw data" because DBMSes tend to optimize a query once and reuse that optimized query every time they see the same query; if you encode data in it, it's a new query each time, so the optimization is defeated.
It's generally better to use parameterized queries; see https://db.rstudio.com/best-practices/run-queries-safely/#parameterized-queries.
For you, I suggest the following:
df <- data.frame(names = c("apple", "microsoft","facebook"))
qmarks <- paste(rep("?", nrow(df)), collapse = ",")
qmarks
# [1] "?,?,?"
dbGetQuery(con,
           sprintf("select name, registered_address from company where lower(name) in (%s)", qmarks),
           params = tolower(df$names))
This takes advantage of three things:
the SQL IN operator, which takes a list (vector in R) of values and conditions on "set membership";
optimized queries; if you subsequently run this query again (with three arguments), then it will reuse the query. (Granted, if you run with other than three companies, then it will have to reoptimize, so this is limited gain);
no need to deal with quoting/escaping your data values; for instance, if it is feasible that your company names might include single or double quotes (perhaps typos on user-entry), then adding the value to the query itself is either going to cause the query to fail, or you will have to jump through some hoops to ensure that all quotes are escaped properly for the DBMS to see it as the correct strings.
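The IN query above matches whole names exactly. If substring matching like the LIKE '%APPLE%' in the question is needed, the same placeholder idea can be sketched as follows. This assumes a DBI driver that accepts ? placeholders (as the suggestion above already does); the query text and bound values are built with base R, and the actual call is left as a comment because it needs a live connection.

```r
df <- data.frame(names = c("apple", "microsoft", "facebook"))

# One "LIKE ?" condition per company, joined with OR.
conditions <- paste(rep("LOWER(name) LIKE ?", nrow(df)), collapse = " OR ")
sql <- paste("SELECT name, registered_address FROM company WHERE", conditions)

# The wildcards go into the bound values, not into the SQL text.
patterns <- as.list(paste0("%", tolower(df$names), "%"))

print(sql)
# With a live DBI connection `con`, this would be run as:
# dbGetQuery(con, sql, params = patterns)
```

Keeping the % wildcards in the bound values means the SQL text itself never contains user data.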
I am trying to get data on page categories from Analytics, but it seems that this variable does not exist, judging by https://ga-dev-tools.appspot.com/dimensions-metrics-explorer/.
If you do know how to get this data, the rest of the question does not apply.
If it indeed does not exist, these are the steps that I follow to get "grouped pages" data:
In order to get page data, I use the dim_filter argument
I define all the pages I need in all.pages
The function (x) allows for the iteration over the different page names
The actual iteration is done with the map function where I pass all.pages through the function (x)
Then, I create a dataframe
Because the actual name of the page used in the regex is not in the dataframe, I add them with the mutate function
all.pages <- c("page.name.1", "page.name.2", "page.name.3")
pages.ga <- function(x) {
  x.pages <- dim_filter(dimension = "pagePath", operator = "REGEXP", expressions = x)
  x.filter <- filter_clause_ga4(list(x.pages))
  x.results <<- google_analytics(ga_id,
                                 date_range = c("2019-09-01", "2019-11-30"),
                                 metrics = c("bounceRate", "pageviews", "avgTimeOnPage"),
                                 dim_filters = x.filter,
                                 max = -1)
}
list.pages.results <- map(all.pages, pages.ga)
df.pages <- dplyr::bind_rows(list.pages.results)
df.pages <- df.pages %>% mutate(page.name = all.pages)
This works when there is no dimension, but it breaks down with even a single dimension because the number of rows will no longer be the same as the length of all.pages.
Would anyone know how to overcome this?
Thanks in advance to all of you!
Best, D.
It exists as the content group dimension:
ga:contentGroupXX
where XX is an integer. To keep things simple, go to the Query Explorer and type "content group" under dimensions. The integers you have available will populate, and you can explore which one you are looking for.
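If content groups are not set up for the view, the per-page loop from the question can still work with a dimension, as long as each result is tagged with its page name before the results are bound together, so the row counts no longer need to match. A minimal base-R sketch, where fetch_page() is a hypothetical stand-in for the google_analytics() call and returns made-up rows purely for illustration:

```r
all.pages <- c("page.name.1", "page.name.2", "page.name.3")

# Hypothetical stand-in for the google_analytics() call: returns a different
# number of rows per page, as a query with a dimension would.
fetch_page <- function(page) {
  n <- match(page, all.pages)  # 1, 2 or 3 rows, for illustration only
  data.frame(date = as.Date("2019-09-01") + seq_len(n) - 1,
             pageviews = seq_len(n) * 10)
}

# Tag each result with its page name inside the loop, then bind.
tagged <- lapply(all.pages, function(page) {
  res <- fetch_page(page)
  res$page.name <- rep(page, nrow(res))  # also safe when zero rows come back
  res
})
df.pages <- do.call(rbind, tagged)
print(df.pages)
```

The same pattern should carry over to purrr::map() and dplyr::bind_rows() by adding the page name inside the mapped function instead of after binding.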
I need to retrieve data from Google Analytics using R.
I wrote the following code with googleAnalyticsR:
df <- google_analytics(viewId = my_id,
                       date_range = c(start, end),
                       metrics = c("pageViews"),
                       dimensions = "pagePath",
                       anti_sample = TRUE,
                       filtersExpression = "ga:pagePath==RisultatoRicerca?nomeCasa",
                       max = 100000)
I need to set the filtersExpression parameter correctly.
I'd like to get data for every pagePath that contains RisultatoRicerca?nomeCasa. This code returns a data frame with 0 rows, which I know is impossible (the data come from an e-commerce site with more than ten thousand interactions per day), so I have begun to think that my filtersExpression is incorrect.
Thanks in advance.
I managed to solve the problem using this filtersExpression:
filtersExpression = "ga:pagePath=@RisultatoRicerca?nomeCasa"
This filter works on the pagePath dimension and keeps every path that contains RisultatoRicerca?nomeCasa (=@ is the "contains substring" operator, whereas == requires an exact match).
I'm trying to run this query with rGoogleAnalytics, but it's throwing the following error:
Error in ParseDataFeedJSON(GA.Data) :
code : 400 Reason : Invalid value 'ga:pagePath=~/companies/[0-9]{6,8};ga:pagePath!@reviews' for filters parameter
I'm trying to fetch pages matching the pattern /companies/ followed by 6-8 numbers and not containing reviews
query.list <- Init(start.date = "2016-01-01",
                   end.date = "2017-03-31",
                   dimensions = "ga:pagePath",
                   metrics = "ga:pageviews",
                   filters = "ga:pagePath=~\/companies\/[0-9]{6,9};ga:pagePath!@reviews",
                   max.results = 10000,
                   table.id = "ga:xxxxxx")
Thanks
It appears that the problem is with your use of {6,9}. Try URL-encoding that part of your regular expression: %7B6%2C9%7D
Use the Query Explorer to play with your query until you find one that works with what you are trying to accomplish.
The documentation states that URL-reserved characters, such as &, must be URL-encoded.
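The encoded quantifier suggested above can be reproduced with base R's utils::URLencode(). This is only a sketch of the encoding step; whether rGoogleAnalytics applies further encoding of its own is not checked here, and only the first condition from the question is rebuilt.

```r
quantifier <- "{6,9}"

# reserved = TRUE also encodes the comma, which the filter syntax would
# otherwise treat as a separator.
encoded <- utils::URLencode(quantifier, reserved = TRUE)
print(encoded)
# [1] "%7B6%2C9%7D"

# Splice the encoded quantifier into the regular-expression condition.
filter_regex <- paste0("ga:pagePath=~/companies/[0-9]", encoded)
print(filter_regex)
```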