I'm using RGoogleAnalytics to retrieve data for multiple dimensions, but every time I run ga.data <- ga$GetReportData(query)
I get this error message:
Error in fromJSON(api.response.json, method = "C") :
unexpected escaped character '\'' at pos 53
Other functions work fine.
How can I fix this?
I use the following code:
require("RGoogleAnalytics")
query <- QueryBuilder()
access_token <- query$authorize()
ga <- RGoogleAnalytics()
ga.profiles <- ga$GetProfileData(access_token)
profile <- ga.profiles$id[3]
startdate <- "2013-10-01"
enddate <- "2013-12-31"
dimension <- "ga:date,ga:source,ga:medium,ga:keyword,ga:city,ga:operatingSystem,ga:landingPagePath"
metric <- "ga:visits,ga:goal1Completions,ga:goal3Completions"
sort <- "ga:visits"
maxresults <- 500000
query$Init(start.date = startdate,
end.date = enddate,
dimensions = dimension,
metrics = metric,
max.results = maxresults,
table.id = paste("ga:",profile,sep="",collapse=","),
access_token=access_token)
ga.data <- ga$GetReportData(query)
I had some trouble with this too, but figured out a way around it.
Step 1: Install Packages
# lubridate
install.packages("lubridate")
# httr
install.packages("httr")
#RGoogleAnalytics
Use this link to download this particular version of RGoogleAnalytics
http://cran.r-project.org/web/packages/RGoogleAnalytics/index.html
Step 2: Create Client ID and Secret ID
Navigate to Google Developers Console.
(https://console.developers.google.com/project)
Create a New Project and Open it.
Navigate to APIs and ensure that the Analytics API is turned On for your project.
Navigate to Credentials and create a New Client ID.
Select Application Type – Installed Application.
Once your Client ID and Client Secret are created, copy them to your R Script.
client.id <- "xxxxxxxxxxxxxxxxxxxxxxxxx"
client.secret <- "xxxxxxxxxxxxxxx"
token <- Auth(client.id,client.secret)
Save the token object for future sessions
save(token,file="./token_file")
In future sessions, you need not generate the Access Token every time. Assuming that you have saved it to a file,
it can be loaded via the following snippet:
load("./token_file")
Validate and refresh the token
ValidateToken(token)
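The save/load pattern above can be wrapped so that the script only requests a new token when no saved one exists. The sketch below is one way to do that. `load_or_create` is a hypothetical helper name, not part of RGoogleAnalytics, and it uses `saveRDS()`/`readRDS()` instead of `save()`/`load()` so the object can be bound to any variable name. A dummy token stands in for the real one so the example runs without credentials.

```r
# Hypothetical helper (not part of RGoogleAnalytics): return the cached object
# stored at `path` if the file exists; otherwise build it with `create_fn`,
# save it to `path`, and return it.
load_or_create <- function(path, create_fn) {
  if (file.exists(path)) {
    return(readRDS(path))
  }
  obj <- create_fn()
  saveRDS(obj, path)
  obj
}

# Stand-in demonstration with a dummy "token".
demo_path <- tempfile(fileext = ".rds")
calls <- 0
make_dummy_token <- function() {
  calls <<- calls + 1
  list(access_token = "dummy")
}
first  <- load_or_create(demo_path, make_dummy_token)  # creates and saves
second <- load_or_create(demo_path, make_dummy_token)  # loads from the file
```

With the real package this would be called as `token <- load_or_create("./token.rds", function() Auth(client.id, client.secret))`, followed by `ValidateToken(token)` as above.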
Step 3: Build required query
query.list <- Init( start.date = "2014-08-01",
end.date = "2014-09-01",
dimensions = "ga:sourceMedium",
metrics = "ga:sessions,ga:transactions",
max.results = 10000,
sort = "-ga:transactions",
table.id = "ga:0000000")
Create the Query Builder object so that the query parameters are validated
ga.query <- QueryBuilder(query.list)
Extract the data and store it in a data-frame
ga.data <- GetReportData(ga.query, token,paginate_query = FALSE)
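The `dimensions`, `metrics` and `sort` arguments are comma-separated strings of `ga:`-prefixed names. Here is a minimal sketch for building them from plain character vectors. `ga_param` is a hypothetical helper, and the limits of 7 dimensions and 10 metrics per query are those of the Core Reporting API v3 as I understand them, so treat them as an assumption if you target a different version.

```r
# Hypothetical helper: turn c("date", "source") into "ga:date,ga:source",
# stopping early if more names are supplied than the assumed limit allows.
ga_param <- function(names, max_n) {
  if (length(names) > max_n) {
    stop("too many values: ", length(names), " > ", max_n)
  }
  # Add the "ga:" prefix only where it is missing, keeping a leading "-"
  # (used for descending sort).
  prefixed <- sub("^(-?)(ga:)?", "\\1ga:", names)
  paste(prefixed, collapse = ",")
}

dimensions <- ga_param(c("date", "source", "medium"), max_n = 7)
metrics    <- ga_param(c("ga:sessions", "transactions"), max_n = 10)
sort_by    <- ga_param("-transactions", max_n = 1)
```

The resulting strings can be passed to `Init()` in place of the hand-written ones.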
Handy Links
Common Errors: developers.google.com/analytics/devguides/reporting/core/v3/coreErrors#standard_errors
Query Explorer:
ga-dev-tools.appspot.com/query-explorer/?csw=1
Dimensions and metrics:
developers.google.com/analytics/devguides/reporting/core/dimsmets
It seems that this error appears when the rjson library can't parse the Google Analytics JSON feed properly. Please try the recently released, updated version of the RGoogleAnalytics library from CRAN.
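For context, `\'` is not one of the escape sequences JSON allows inside a string, which is why a strict parser stops at that position. If upgrading is not an option, one possible workaround is to strip the stray backslash from the raw response text before parsing. The sketch below uses a made-up response fragment and base R only. It assumes the backslash before the apostrophe is never itself the second half of an escaped backslash (`\\`).

```r
# Made-up fragment of a response containing the invalid JSON escape \'
raw_json <- "{\"keyword\": \"men\\'s shoes\"}"

# Replace the two-character sequence backslash + apostrophe with a plain
# apostrophe. fixed = TRUE treats the pattern literally, not as a regex.
clean_json <- gsub("\\'", "'", raw_json, fixed = TRUE)

cat(clean_json, "\n")
```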
My organization uses Pheedloop and I'm trying to build a dynamic solution for accessing its data.
So, how do I access the Pheedloop API using R? Specifically, how do I correctly submit my API credentials to Pheedloop and download data? I also need the final data to be in a data frame.
Use the RCurl package along with jsonlite. Importantly, you need to send a header with your request.
orgcode<-'yourcode'
myapikey<-'yourapikey'
mysecret<-'yourapisecret'
library(RCurl)
library(jsonlite)
# AUTHENTICATION
authen<-paste0("https://api.pheedloop.com/api/v3/organization/",orgcode,"/validateauth/") # create a link with parameters
RCurl::getURL(
authen,
httpheader = c('X-API-KEY' = myapikey, 'X-API-SECRET' = mysecret), # include key and secret in the header like this
verbose = TRUE)
# LIST EVENTS
events<-paste0("https://api.pheedloop.com/api/v3/organization/",orgcode, "/events/")
# the result will be a JSON file
cscEvents<-getURL(
events,
httpheader = c('X-API-KEY' = myapikey, 'X-API-SECRET' = mysecret),
verbose = FALSE)
cscEvents<-fromJSON(cscEvents ); # using jsonlite package to parse json format
cscEventsResults<-cscEvents$results # accessing the results table
table(cscEventsResults$event_name) # examine
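Because each endpoint URL is assembled by hand, a missing or misplaced slash is easy to introduce. Here is a small sketch of a URL builder that uses only the base URL and path segments shown above. `pheedloop_url` is a hypothetical helper name, not part of any Pheedloop client.

```r
# Hypothetical helper: build an endpoint URL for the organization,
# always ending in a trailing slash as in the URLs above.
pheedloop_url <- function(orgcode, endpoint) {
  base <- "https://api.pheedloop.com/api/v3/organization"
  # Drop any slashes around the endpoint so the pieces join with exactly one "/".
  endpoint <- gsub("^/+|/+$", "", endpoint)
  paste0(base, "/", orgcode, "/", endpoint, "/")
}

authen <- pheedloop_url("yourcode", "validateauth")
events <- pheedloop_url("yourcode", "/events/")
```

The results can be passed to `getURL()` together with the same `httpheader` argument as above.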
I am trying to integrate performance testing of certain websites using GTmetrix. With the API, I am able to run the test and pull the results using the SEO connector tool in Microsoft Excel. However, that tool uses XML with an older version of the API, and some newer tests are not available through it. The latest API version is 2.0.
The link for the XML is here: GTmetrix XML for API 0.1.
I tried using the libraries httr and jsonlite, but I don't know how to authenticate with the API, run the test, and extract the results.
The documentation for the API is available at API Documentation.
library(httr)
library(jsonlite)
url <- "https://www.berkeley.edu" # URL to be tested
location <- 1 # testing Location
browser <- 3 # Browser to be used for testing
res <- GET("https://gtmetrix.com/api/gtmetrix-openapi-v2.0.json")
data <- fromJSON(rawToChar(res$content))
Update 2021-11-08:
I whipped up a small library to talk to GTmetrix via R. There's some basic sanity checking baked in, but obviously this is still a work in progress and there are (potentially critical) bugs. Feel free to check it out, though. Would love some feedback.
# Install and load library.
devtools::install_github("RomanAbashin/rgtmx")
library(rgtmx)
Update 2021-11-12: It's available on CRAN now. :-)
# Install and load library.
install.packages("rgtmx")
library(rgtmx)
Start test (and get results)
# Minimal example #1.
# Returns the final report after checking test status roughly every 3 seconds.
result <- start_test("google.com", "[API_KEY]")
This will start a test and wait for the report to be generated, returning the result as a data.frame. Optionally, you can return just the test ID and other metadata via the parameter wait_for_completion = FALSE.
# Minimal example #2.
# Returns just the test ID and some meta data.
result <- start_test("google.com", "[API_KEY]", wait_for_completion = FALSE)
Other optional parameters: location, browser, report, retention, httpauth_username, httpauth_password, adblock, cookies, video, stop_onload, throttle, allow_url, block_url, dns, simulate_device, user_agent, browser_width, browser_height, browser_dppx, browser_rotate.
Show available browsers
show_available_browsers("[API_KEY]")
Show available locations
show_available_locations("[API_KEY]")
Get specific test
get_test("[TEST_ID]", "[API_KEY]")
Get specific report
get_report("[REPORT_ID]", "[API_KEY]")
Get all tests
get_all_tests("[API_KEY]")
Get account status
get_account_status("[API_KEY]")
Original answer:
Pretty straightforward, actually:
0. Set test parameters.
# Your api key from the GTmetrix console.
api_key <- "[Key]"
# All attributes except URL are optional, and the availability
# of certain options may depend on the tier of your account.
# URL to test.
url <- "https://www.worldwildlife.org/"
# Testing location ID.
location_id <- 1
# Browser ID.
browser_id <- 3
1. Start a test
res_test_start <- httr::POST(
url = "https://gtmetrix.com/api/2.0/tests",
httr::authenticate(api_key, ""),
httr::content_type("application/vnd.api+json"),
body = jsonlite::toJSON(
list(
"data" = list(
"type" = "test",
"attributes" = list(
"url" = url,
# Optional attributes go here.
"location" = location_id,
"browser" = browser_id
)
)
),
auto_unbox = TRUE
),
encode = "raw"
)
2. Get test ID
test_id <- jsonlite::fromJSON(rawToChar(res_test_start$content))$data$id
3. Get report ID
# Wait a bit, as generating the report can take some time.
res_test_status <- httr::GET(
url = paste0("https://gtmetrix.com/api/2.0/tests/", test_id),
httr::authenticate(api_key, ""),
httr::content_type("application/vnd.api+json")
)
# If this returns the test ID, the report is not ready yet.
report_id <- jsonlite::fromJSON(rawToChar(res_test_status$content))$data$id
4. Get report
res_report <- httr::GET(
url = paste0("https://gtmetrix.com/api/2.0/reports/", report_id),
httr::authenticate(api_key, ""),
httr::content_type("application/vnd.api+json")
)
# The report is a nested list with the results as you know them from GTmetrix.
report <- jsonlite::fromJSON(rawToChar(res_report$content))$data
I'm kinda tempted to build something for this as there seems to be no R library for it...
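The "wait a bit" in step 3 can be turned into a polling loop. Below is a minimal sketch in base R. `poll_until` is a hypothetical helper, and the stand-in `fake_check` function only simulates a status request so the example runs without an API key. In practice `check_fn` would wrap the `httr::GET()` call from step 3, and `is_done` would test whether the returned ID differs from the test ID.

```r
# Hypothetical helper: call `check_fn()` repeatedly until `is_done(result)` is
# TRUE, sleeping `interval` seconds between attempts and giving up after
# `max_tries` attempts.
poll_until <- function(check_fn, is_done, interval = 3, max_tries = 20) {
  for (attempt in seq_len(max_tries)) {
    result <- check_fn()
    if (is_done(result)) {
      return(result)
    }
    Sys.sleep(interval)
  }
  stop("gave up after ", max_tries, " attempts")
}

# Stand-in for the status request: returns the test ID twice, then a report ID.
test_id <- "test-123"
n_calls <- 0
fake_check <- function() {
  n_calls <<- n_calls + 1
  if (n_calls < 3) test_id else "report-456"
}

report_id <- poll_until(fake_check, function(id) id != test_id, interval = 0)
```

The default interval of 3 seconds mirrors the "roughly every 3 seconds" mentioned for the library above. The `max_tries` cap is an arbitrary choice to keep the loop from running forever.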
How do I fix an error like this:
library(twitteR)
library(ROAuth)
customer_key<-"-aaaaaa"
customer_secret<-"aaaaaaaa"
access_token<-"212123213-aaaaa"
access_secret<-"ccccccccccccc"
setup_twitter_oauth(customer_key,customer_secret,access_token,access_secret)
Tweets = searchTwitter("anxiety", n=100, lang = "en")
Error: invalid assignment for reference class field ‘language’, should be from class “character” or a subclass (was class “NULL”)
I use RStudio to run this code, and this is my first time using the Twitter API. I followed a tutorial in which the same code worked, but when I tried it myself I got the error above. Thank you.
consumer_key <- "xxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
consumer_secret <- "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
access_token <- "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
access_secret <- "xxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
The steps below extract tweets matching the keyword #statistics, retrieving up to 10000 of them:
setup_twitter_oauth(consumer_key, consumer_secret, access_token, access_secret)
tw = searchTwitter('#statistics',
n = 10000,
retryOnRateLimit = 1617)
Make sure the token you use is up to date. It is also possible that the word you are searching for, "anxiety", does not appear in any tweets from the last 7 days, because the search API only returns data going back 7 days. For example, if I extract data on the 8th, I will get tweets containing the word "anxiety" from the 2nd to the 8th.
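The 7-day window described above can be computed directly. Here is a tiny sketch, assuming the window includes the extraction day itself (as in the 2nd-to-8th example). The date is made up.

```r
# Made-up extraction date (the "8th" from the example above).
extraction_date <- as.Date("2021-11-08")

# Seven days including the extraction day: go back 6 days.
window_start <- extraction_date - 6
window <- seq(window_start, extraction_date, by = "day")

format(window_start)  # first day covered by the search
length(window)        # number of days in the window
```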
I have quite a large set of URLs (> 8,500) that I want to query the Google Analytics API for using R. I'm working with the googleAnalyticsR package. The problem is that I am able to loop through my set of URLs, but the resulting data frame only contains the total values for the host ID in each row (i.e. the same values in every row).
Here's how far I got to this point:
library(googleAnalyticsR)
library(lubridate)
#Authorize with google
ga_auth()
ga.acc.list = ga_account_list()
my.id = 123456
#set time range
soty = floor_date(Sys.Date(), "year")
yesterday = floor_date(Sys.Date(), "day") - days(1)
#get some - in this case - random URLs
urls = c("example.com/de/", "example.com/us/", "example.com/en/")
urls = gsub("^example.com/", "ga:pagePath=~", urls)
df = data.frame()
#get data
for(i in urls){
ga.data = google_analytics_4(my.id,
date_range = c(soty, yesterday),
metrics = c("pageviews","avgTimeOnPage","entrances","bounceRate","exitRate"),
filters = urls[i])
df = rbind(df, ga.data)}
The result is that every row of the resulting data frame contains the total statistics for the my.id domain (own data):
Output result
Does anyone know of a better way to tackle this, or does Google Analytics simply prevent us from querying it in such a way?
What you're getting is normal: you only queried for metrics (c("pageviews","avgTimeOnPage","entrances","bounceRate","exitRate")), so you only get your metrics.
If you want to break down those metrics, you need to use dimensions:
https://developers.google.com/analytics/devguides/reporting/core/dimsmets
In your case you're interested in the ga:pagePath dimension, so something like this (untested code):
ga.data = google_analytics_4(my.id,
date_range = c(soty, yesterday),
dimensions=c("pagePath"),
metrics = c("pageviews","avgTimeOnPage","entrances","bounceRate","exitRate"),
filters = urls[i])
I advise you to use the Google Analytics Query Explorer until you get the desired results, then port it to R.
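One more thing worth checking in the loop from the question: `for (i in urls)` makes `i` the URL string itself, so `urls[i]` looks up an element by name. Since `urls` has no names, that lookup returns `NA` rather than the filter string, which may be why every iteration returns unfiltered totals. A base R sketch of the difference:

```r
urls <- c("example.com/de/", "example.com/us/", "example.com/en/")
urls <- gsub("^example.com/", "ga:pagePath=~", urls)

# Indexing by value: `i` is a string and `urls` has no names, so the result is NA.
by_value <- character(0)
for (i in urls) {
  by_value <- c(by_value, urls[i])
}

# Indexing by position returns the intended filter strings.
by_position <- character(0)
for (i in seq_along(urls)) {
  by_position <- c(by_position, urls[i])
}
```

Inside the original loop, passing `i` directly as the filter (or iterating over `seq_along(urls)`) avoids the problem.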
As for the number of results, you might be limited to 1K by default until you increase max_rows. There is a hard limit of 10K per request from the API, which means you then have to use pagination to retrieve more results if needed. I see some examples in the R documentation with max=99999999; I don't know whether the R library automatically handles pagination beyond the first 10K or whether they are unaware of the hard limit:
batch_gadata <- google_analytics(id = ga_id,
start="2014-08-01", end="2015-08-02",
metrics = c("sessions", "bounceRate"),
dimensions = c("source", "medium",
"landingPagePath",
"hour","minute"),
max=99999999)
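If you do end up paginating by hand, the bookkeeping is simple. Here is a sketch of computing the start index and size of each page, assuming 1-based start indices and a page size of 10,000 as described above. The total of 25,000 rows is made up.

```r
# Assumed values: page size from the 10K limit above, made-up total row count.
page_size  <- 10000
total_rows <- 25000

# 1-based index of the first row on each page.
page_starts <- seq(1, total_rows, by = page_size)

# Number of rows to request for each page (the last page may be shorter).
page_sizes <- pmin(page_size, total_rows - page_starts + 1)

data.frame(start_index = page_starts, max_results = page_sizes)
```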
I am new to the Google Analytics API. I authenticated my application in R using this code:
library(RGoogleAnalytics)
client.id <- "**************.apps.googleusercontent.com"
client.secret <- "**********************"
token <- Auth(client.id, client.secret)
save(token,file="./token_file")
ValidateToken(token)
I am trying to figure out what to enter for the parameters below:
query.list <- Init(start.date = "2011-11-28",
end.date = "2014-12-04",
dimensions = "ga:date,ga:pagePath,ga:hour,ga:medium",
metrics = "ga:sessions,ga:pageviews",
max.results = 10000, sort = "-ga:date", table.id = "ga:33093633")
Where can I find the values for dimensions, metrics, sort, and table.id?
My eventual goal is to pull the text from "https://plus.google.com/105253676673287651806/posts"
Please assist me with this.
Using Google Analytics and R may not suit what you want to do here, as the Google+ website won't be included in the data you collect.
You may want to look at rvest, which is a web scraping package for R. You could then pull the information you need from any public URL into an R data frame to analyse later.
Query Explorer:
https://ga-dev-tools.appspot.com/query-explorer/?csw=1
Dimensions and metrics:
https://developers.google.com/analytics/devguides/reporting/core/dimsmets