Character string truncated when passed to function - r

I have a long list of campaign names that I need to collapse into a character vector of length 1 and then pass as the "where" clause in a call to the Google AdWords API through the "RAdwords" package.
Creating this character string is not a problem until its length reaches a certain point, after which the value appears truncated, causing an error in the AdWords API call.
Here is a sample of the setup that will not cause an error:
campaigns <- paste0("Campaign ", seq(1,5))
collapsed_campaigns <- paste0(campaigns, collapse = "','")
campaign_filter1 <- paste("CampaignName IN ['", collapsed_campaigns, "']")
And here is a setup that will cause an error:
campaigns <- paste0("Campaign ", seq(1,50))
collapsed_campaigns <- paste0(campaigns, collapse = "','")
campaign_filter2 <- paste("CampaignName IN ['", collapsed_campaigns, "']")
Inspecting the structure of each variable shows:
> str(campaign_filter1)
chr "CampaignName IN [' Campaign 1','Campaign 2','Campaign 3',
'Campaign 4','Campaign 5 ']"
> str(campaign_filter2)
chr "CampaignName IN [' Campaign 1','Campaign 2','Campaign 3',
'Campaign 4','Campaign 5','Campaign 6','Campaign 7','Campaign 8','Camp"| __truncated__
If I pass 'campaign_filter1' as my where clause in RAdwords, things run as expected.
If I pass 'campaign_filter2' as the where clause, I get this error:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?><reportDownloadError>
<ApiError><type>QueryError.INVALID_WHERE_CLAUSE</type><trigger></trigger>
<fieldPath></fieldPath></ApiError></reportDownloadError>
It seems the "| __truncated__" portion is being passed literally to the RAdwords function.
Here is the result of inspecting the structure of "traffic_data" in a failed call to RAdwords:
> str(traffic_data)
Classes ‘data.table’ and 'data.frame': 1 obs. of 1 variable:
$ ads: chr "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>
<reportDownloadError><ApiError><type>QueryError.INVALID_WHERE_CLAU"| __truncated__
- attr(*, ".internal.selfref")=<externalptr>
Obviously, I could get around this with some sort of looping function and request the data from the API one campaign at a time, but that would be horribly inefficient. How can I get the entire character string passed to RAdwords?

One question upfront: why don't you download all campaign data and filter the resulting dataframe in R? With this strategy the whole campaign-name string pasting process becomes superfluous. You could filter the dataframe with vector operations in R. This approach is probably more robust and less error-prone.
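A minimal sketch of that strategy with base R subsetting; `traffic_data` and the campaign names here are made-up placeholders standing in for a full, unfiltered API download:

```r
# Hypothetical result of downloading ALL campaign data without a where clause
traffic_data <- data.frame(
  CampaignName = paste0("Campaign ", 1:100),
  Clicks = seq_len(100)
)

# The campaigns you actually care about
wanted <- paste0("Campaign ", 1:50)

# Vectorised filtering in R instead of a long IN [...] where clause
filtered <- traffic_data[traffic_data$CampaignName %in% wanted, ]
```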
However, if you want to filter campaigns explicitly in your API call you can do it with this code:
# 1. Download all campaigns
# query all campaign names
body1 <- statement(select = c('CampaignName'),
                   report = "CAMPAIGN_PERFORMANCE_REPORT",
                   start = "2017-11-01",
                   end = "2017-11-02")
# download all campaign names
campaigns <- getData(clientCustomerId = "***-***-****",
                     google_auth = google_auth,
                     statement = body1,
                     apiVersion = "201710",
                     transformation = T,
                     changeNames = T)
# 2. Build query with all campaigns in where clause
# build string for where clause
cmp_string <- paste0(campaigns$Campaign, collapse = "','")
cmp_string <- paste("CampaignName IN ['", cmp_string, "']", sep = "")
# query all campaigns with where condition
body2 <- statement(select = c('CampaignName'),
                   where = cmp_string,
                   report = "CAMPAIGN_PERFORMANCE_REPORT",
                   start = "2017-11-01",
                   end = "2017-11-02")
# download all campaigns using the where clause
campaigns2 <- getData(clientCustomerId = "***-***-****",
                      google_auth = google_auth,
                      statement = body2,
                      apiVersion = "201710",
                      transformation = T,
                      changeNames = T)
In the first part I download all campaign names to have data for the where clause. In the second part I demonstrate how to download all campaigns again utilizing the where clause with all campaigns as filter.
I tested the code above with over 200 campaigns. There were no issues with either the RAdwords package or the AdWords API.
I suspect there are issues with the string you pass in campaign_filter2. Within paste() you forgot to set sep = "". Otherwise you end up with a space at the beginning of the first campaign name.
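For completeness, a sketch of the corrected string construction. Note that str() shortens long strings for display only; nchar() confirms the full value is intact:

```r
campaigns <- paste0("Campaign ", seq(1, 50))
collapsed_campaigns <- paste0(campaigns, collapse = "','")

# sep = "" avoids the stray spaces after [' and before ']
campaign_filter <- paste("CampaignName IN ['", collapsed_campaigns, "']", sep = "")

# The full value survives even though str() displays it truncated
nchar(campaign_filter)
str(campaign_filter)
```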

Related

Why am I having a problem downloading patent data with "patentsview" in R

I am trying to fetch patent data with the "patentsview" package in R but I am always getting an error and I couldn't find the solution anywhere. Here's my code -
# Load library
library(patentsview)
# Write query
query <- with_qfuns(
and(
begins(cpc_subgroup_id = 'G06N'),
gte(patent_year = 2020)
)
)
# Create a list of fields
# get_fields(endpoint = "patents")
# Needed Fields
fields <- c(
"patent_id",
"patent_title",
"patent_abstract",
"patent_date"
)
# Send an HTTP request to the PatentsView API to get the data
pv_res <- search_pv(query = query, fields = fields, all_pages = TRUE)
The output is -
Error in xheader_er_or_status(resp) : Not Found (HTTP 404).
What am I doing wrong here? And what is the solution?

Get full list of Adwords MCC with R

I need to get the list of all MCC with an Adwords account via Google API and R.
So far I've found some packages to get the list of all clientID within a single MCC but I've found no example to get the list of all MCC within an Adwords account.
Do someone have experience on this topic?
So far I've tried:
library(RAdwordsPlus)
library(RAdwords)
google_auth <- doAuth()
api_version <- "v201809"
customer_id <- "MCC-MAIN-CODE"
request <- RAdwordsPlus::managed.customer.request(fields = c("Name", "CustomerId"))
r <- RAdwordsPlus::get.service(request = request,
cid = customer_id,
auth = google_auth,
api.version = api_version,
user.agent = "r-adwordsplus-test",
verbose = FALSE,
raw = FALSE,
partial.failure = FALSE)
Code ended up with this error:
Warning message:
In parser(response) : x is not a valid managed.customer
My Account structure is something like:
Main MCC
Customer 1 (client_id_1)
Camp_#1
Camp_#2
Customer 2 (client_id_2)
Camp_#1
Camp_#2
Customer 3 (client_id_3)
Camp_#1
Camp_#2
As stated, my goal is to get all the client_ids in order to gather data for every Customer in the account.
Thanks.
Looks like JB already answered your question in his docs at:
https://jburkhardt.github.io/RAdwords/faq/#list-account-ids
List account IDs
How to list all AdWords account IDs which are in my MCC?
We would love to implement this feature! Unfortunately the AdWords API
reporting service does not allow querying account information at the
client center level.
However, the good news is you only need to authenticate once in order to
access all accounts within your MCC. Best practice is to create a
vector containing the account IDs and loop over the vector.
Example of that would be something like:
load('.google.auth.RData')
adwords_accounts <- c(
"495-862-1111",
"613-408-2222",
"564-802-3333",
"902-758-4444",
"536-035-5555",
"708-304-6666",
"429-737-7777",
"532-474-8888")
#
account_performance <- statement(select= c('Date','AccountDescriptiveName','Cost','Clicks'),
report="ACCOUNT_PERFORMANCE_REPORT",
start="2019-01-01",
end=as.character(Sys.Date()))
#
list_of_data <- lapply(adwords_accounts, function(x) getData(clientCustomerId = x, google_auth = google_auth, statement = account_performance))
adwords_data <- do.call(rbind,list_of_data)

googleAnalyticsR 403 user permission

I am cycling through 50 GA accounts, and there is one profile I do not have permission to access. The error I am getting is...
Error : API returned: User does not have sufficient permissions for this profile.
Error in error_check(out) :
API returned: User does not have sufficient permissions for this profile.
I'm new to R and am wondering if I can write a tryCatch to skip this account and continue the loop. At the moment the process stops with a request status code 403.
library(googleAnalyticsR)
library(tidyverse)
#settings
start_date <- as.character(Sys.Date()-31)
end_date <- as.character(Sys.Date()-1)
metrics <- c("sessions", "pageviews")
dimensions <- "year"
#Authorize Google Analytics R- this will open a webpage
#You must be logged into your Google Analytics account on your web browser
ga_auth()
account_summary <- ga_account_list()
#Add the start and end date to the date frame, as well as some columns to use to populate the metrics
account_summary$start_date <- start_date
account_summary$end_date <- end_date
# cycle through the list of views, pull the data, and add it to the
#account_summary
for (i in 1:nrow(account_summary)){
view_id <- account_summary$viewId[i]
ga_data <- google_analytics_4(viewId = view_id,
date_range = c(start_date,end_date),
metrics = metrics,
dimensions = dimensions)
# This query might return multiple rows (if it spans a year boundary), so
#collapse and clean up
ga_data <- summarise(ga_data,
sessions = sum(sessions),
pageviews = sum(pageviews))
#add the totals to the account summary
account_summary$sessions[i] <- ga_data$sessions
account_summary$pageviews[i] <- ga_data$pageviews
}
# Make a more compact set of data
clean_summary <- select(account_summary,
                        Account = accountName,
                        Property = webPropertyName,
                        View = viewName,
                        Type = type,
                        Level = level,
                        'Start Date' = start_date,
                        'End Date' = end_date,
                        Sessions = sessions,
                        Pageviews = pageviews)
write.csv (clean_summary, "summary_results.csv", row.names = FALSE)
You can use tryCatch like this:
...
ga_data <- tryCatch(google_analytics_4(viewId = view_id,
date_range = c(start_date,end_date),
metrics = metrics,
dimensions = dimensions),
error = function(ex){
warning(ex)
NULL
})
...
The example turns the error into a warning instead, and returns NULL.
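A self-contained sketch of how the surrounding loop can then skip the NULLs; fetch() here is a stand-in for google_analytics_4():

```r
# Stand-in for google_analytics_4(): fails for one view id
fetch <- function(id) {
  if (id == "no-access") stop("User does not have sufficient permissions")
  data.frame(sessions = 10, pageviews = 20)
}

view_ids <- c("view-1", "no-access", "view-2")

results <- lapply(view_ids, function(id) {
  tryCatch(fetch(id), error = function(ex) {
    warning(conditionMessage(ex))  # log the failure and carry on
    NULL
  })
})

# Drop the failed views before combining
ok <- !vapply(results, is.null, logical(1))
combined <- do.call(rbind, results[ok])
```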

xml to R dataframe, multiple layers of children

I was trying to convert an XML file into an R data frame using the XML package. I was able to get a data frame successfully, but whenever there were grandchildren under a child, the values of the grandchildren were merged into one column.
Here is what the XML looks like:
<user>
<created-at type="datetime">2012-12-20T18:32:20+00:00</created-at>
<details></details>
<is-active type="boolean">true</is-active>
<last-login type="datetime">2017-06-22T16:52:11+01:00</last-login>
<time-zone>Pacific Time (US & Canada)</time-zone>
<updated-at type="datetime">2017-06-22T21:00:47+01:00</updated-at>
<is-verified type="boolean">true</is-verified>
<groups type="array">
<group>
<created-at type="datetime">2015-02-09T09:34:41+00:00</created-at>
<id type="integer">23215935</id>
<is-active type="boolean">true</is-active>
<name>Product Managers</name>
<updated-at type="datetime">2015-02-09T09:34:41+00:00</updated-at>
</group>
</groups>
</user>
The code I used were:
users_xml = xmlTreeParse("users.xml")
top_users = xmlRoot(users_xml)
users = xmlSApply(top_users, function(x) xmlSApply(x, xmlValue))
The result I got had all the elements listed fine, except it combined everything under "groups" into one column. Is there any way I can make each element under "group" a separate column in the final data frame?
I also tried
nodes = getNodeSet(top_users, "//groups[@group]")
and
nodes = getNodeSet(top_users, "//groups/group[@group]")
and
nodes = getNodeSet(top_users, "//.groups/group[@group]")
and switched "top_users" to "users_xml", but each time got the error message:
Error: 1: Input is not proper UTF-8, indicate encoding !
Bytes: 0xC2 0x3C 0x2F 0x6E
Then tried
data.frame(t(xpathSApply(xmlRoot(xmlTreeParse("users.xml", useInternalNodes = T)),
"//user", function(y) xmlSApply(y, xmlValue))))
Which gave me the exact same thing as the first solution.
And finally, I tried
data.frame(t(xpathSApply(xmlRoot(xmlTreeParse("users.xml", useInternalNodes = T)),
"//user/groups/group", function(y) xmlSApply(y, xmlValue))))
Which did give me a dataframe but only with elements in "group", and there is no way I can map it back to the first table I got that has all elements in "user".
Consider column binding with xmlToDataFrame() of user children and groups children:
userdf <- xmlToDataFrame(nodes=getNodeSet(doc, "/user"))
groupdf <- xmlToDataFrame(nodes=getNodeSet(doc, "/user/groups/group"))
df <- transform(cbind(userdf, groupdf), groups = NULL) # REMOVE groups COL
df
# created.at details is.active last.login time.zone
# 1 2012-12-20T18:32:20+00:00 true 2017-06-22T16:52:11+01:00 Pacific Time (US & Canada)
# updated.at is.verified created.at.1 id is.active.1 name
# 1 2017-06-22T21:00:47+01:00 true 2015-02-09T09:34:41+00:00 23215935 true Product Managers
# updated.at.1
# 1 2015-02-09T09:34:41+00:00
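The same approach, runnable end to end on a trimmed-down inline document (the ampersand in the time zone is escaped here to keep the XML well-formed):

```r
library(XML)

xml_txt <- '<user>
  <details>abc</details>
  <time-zone>Pacific Time (US &amp; Canada)</time-zone>
  <groups type="array">
    <group>
      <id type="integer">23215935</id>
      <name>Product Managers</name>
    </group>
  </groups>
</user>'

doc <- xmlParse(xml_txt)

# One data frame per nesting level, then column-bind them
userdf  <- xmlToDataFrame(nodes = getNodeSet(doc, "/user"))
groupdf <- xmlToDataFrame(nodes = getNodeSet(doc, "/user/groups/group"))
df <- transform(cbind(userdf, groupdf), groups = NULL)  # drop merged groups col
```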

Handling internet connection R

I'm trying to download several stocks from Google, but every time the connection drops, R stops the loop. How can I handle this problem?
stocks <- c(
'MSFT',
'GOOG',
...
)
for (symbol in stocks)
{
stock_price <- getSymbols(symbol,src='google', from=startDate,to=endDate,auto.assign = FALSE)
prices[,j] <- stock_price[,1]
j <- j + 1
}
From the R manual "quantmod.pdf":
If auto.assign=FALSE or env=NULL (as of 0.4-0) the data will be returned from the call, and will require the user to assign the results himself. Note that only one symbol at a time may be requested when auto assignment is disabled.
You are trying to request more than one ticker symbol at a time with the auto.assign parameter set to FALSE, and this is not allowed. However, you should be able to obtain all your symbols at once by adapting the following code:
data <- new.env()
getSymbols.extra(stocks, src = 'google', from = startDate, to = endDate, env = data, auto.assign = T)
plot(data$MSFT)
Pay careful attention to the R manual for getSymbols
"Data is fetched through one of the available getSymbols methods and saved in the env specified - the .GlobalEnv by default."
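If the underlying problem really is a flaky connection, a generic retry wrapper (a base-R sketch, not part of quantmod) can keep the loop alive; in the original loop you would call something like with_retry(function() getSymbols(symbol, src = 'google', auto.assign = FALSE)):

```r
# Retry a function up to `tries` times, sleeping `wait` seconds between attempts
with_retry <- function(fun, tries = 3, wait = 1) {
  for (i in seq_len(tries)) {
    result <- tryCatch(fun(), error = function(ex) ex)
    if (!inherits(result, "error")) return(result)
    Sys.sleep(wait)
  }
  stop("all retries failed: ", conditionMessage(result))
}

# Demo with a fake downloader that fails twice, then succeeds
attempts <- 0
flaky_download <- function() {
  attempts <<- attempts + 1
  if (attempts < 3) stop("connection reset")
  "data"
}
res <- with_retry(flaky_download, tries = 5, wait = 0)
```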
