My organization uses PheedLoop and I'm trying to build a dynamic solution for accessing its data.
So, how do I access the PheedLoop API using R? Specifically, how do I submit my API credentials to PheedLoop and download data? I also need the final data to be in data frame format.
Use the RCurl package along with jsonlite. Importantly, you need to send your API credentials as request headers.
library(RCurl)
library(jsonlite)

orgcode <- 'yourcode'
myapikey <- 'yourapikey'
mysecret <- 'yourapisecret'
# AUTHENTICATION
authen <- paste0("https://api.pheedloop.com/api/v3/organization/", orgcode, "/validateauth/") # build the endpoint URL with your org code
RCurl::getURL(
  authen,
  httpheader = c('X-API-KEY' = myapikey, 'X-API-SECRET' = mysecret), # include key and secret in the header like this
  verbose = TRUE)
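If you want to check the result programmatically instead of watching the verbose output, you can capture and parse the body; a small sketch (the exact payload shape is an assumption, so inspect what your account returns):
authResponse <- RCurl::getURL(
  authen,
  httpheader = c('X-API-KEY' = myapikey, 'X-API-SECRET' = mysecret))
jsonlite::fromJSON(authResponse) # should indicate whether the credentials are valid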
# LIST EVENTS
events <- paste0("https://api.pheedloop.com/api/v3/organization/", orgcode, "/events/")
# the result will be a JSON string
cscEvents <- getURL(
  events,
  httpheader = c('X-API-KEY' = myapikey, 'X-API-SECRET' = mysecret),
  verbose = FALSE)
cscEvents <- fromJSON(cscEvents) # parse the JSON with jsonlite
cscEventsResults <- cscEvents$results # access the results table
table(cscEventsResults$event_name) # examine the event names
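Since the goal is a data frame: jsonlite's fromJSON() simplifies JSON arrays to data frames by default, so cscEventsResults should already be one. If some columns come back as nested records, flatten() collapses them (a sketch, assuming nested columns are present):
class(cscEventsResults) # typically "data.frame" already
eventsDf <- jsonlite::flatten(cscEventsResults) # collapse any nested record columns
str(eventsDf)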
I am trying to integrate performance testing of certain websites using GTmetrix. With the API, I am able to run a test and pull the results using the SEO connector tool in Microsoft Excel. However, that tool uses XML with an older version of the API (0.1), and some newer tests are not available in it; the latest version is 2.0.
The link for the xml is here: GTmetrix XML for API 0.1.
I tried using the httr and jsonlite libraries, but I don't know how to authenticate with the API, run a test, and extract the results.
The documentation for the API is available at API Documentation.
library(httr)
library(jsonlite)
url <- "https://www.berkeley.edu" # URL to be tested
location <- 1 # testing location ID
browser <- 3 # browser to be used for testing
res <- GET("https://gtmetrix.com/api/gtmetrix-openapi-v2.0.json")
data <- fromJSON(rawToChar(res$content))
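For what it's worth, the parsed spec is just a nested list, so I can at least list the endpoints that v2.0 exposes:
names(data$paths) # endpoint paths defined in the OpenAPI spec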
Update 2021-11-08:
I whipped up a small library to talk to GTmetrix via R. There's some basic sanity checking baked in, but obviously this is still work in progress and there are (potentially critical) bugs. Feel free to check it out, though. Would love some feedback.
# Install and load library.
devtools::install_github("RomanAbashin/rgtmx")
library(rgtmx)
Update 2021-11-12: It's available on CRAN now. :-)
# Install and load library.
install.packages("rgtmx")
library(rgtmx)
Start test (and get results)
# Minimal example #1.
# Returns the final report after checking test status roughly every 3 seconds.
result <- start_test("google.com", "[API_KEY]")
This will start a test and wait for the report to be generated, returning the result as a data.frame. Optionally, you can return just the test ID and other metadata by setting the parameter wait_for_completion = FALSE.
# Minimal example #2.
# Returns just the test ID and some meta data.
result <- start_test("google.com", "[API_KEY]", wait_for_completion = FALSE)
Other optional parameters (see the sketch below): location, browser, report, retention, httpauth_username, httpauth_password, adblock, cookies, video, stop_onload, throttle, allow_url, block_url, dns, simulate_device, user_agent, browser_width, browser_height, browser_dppx, browser_rotate.
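A quick sketch of passing a few of these by name (the location and browser IDs plus the adblock flag below are placeholder assumptions; check show_available_locations() and show_available_browsers() for values valid on your account):
result <- start_test(
  "google.com", "[API_KEY]",
  location = "1", # placeholder location ID
  browser = "3", # placeholder browser ID
  adblock = 1 # block ads during the test
)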
Show available browsers
show_available_browsers("[API_KEY]")
Show available locations
show_available_locations("[API_KEY]")
Get specific test
get_test("[TEST_ID]", "[API_KEY]")
Get specific report
get_report("[REPORT_ID]", "[API_KEY]")
Get all tests
get_all_tests("[API_KEY]")
Get account status
get_account_status("[API_KEY]")
Original answer:
Pretty straightforward, actually:
0. Set test parameters.
# Your api key from the GTmetrix console.
api_key <- "[Key]"
# All attributes except URL are optional, and the availability
# of certain options may depend on the tier of your account.
# URL to test.
url <- "https://www.worldwildlife.org/"
# Testing location ID.
location_id <- 1
# Browser ID.
browser_id <- 3
1. Start a test
res_test_start <- httr::POST(
url = "https://gtmetrix.com/api/2.0/tests",
httr::authenticate(api_key, ""),
httr::content_type("application/vnd.api+json"),
body = jsonlite::toJSON(
list(
"data" = list(
"type" = "test",
"attributes" = list(
"url" = url,
# Optional attributes go here.
"location" = location_id,
"browser" = browser_id
)
)
),
auto_unbox = TRUE
),
encode = "raw"
)
2. Get test ID
test_id <- jsonlite::fromJSON(rawToChar(res_test_start$content))$data$id
3. Get report ID
# Wait a bit, as generating the report can take some time.
res_test_status <- httr::GET(
url = paste0("https://gtmetrix.com/api/2.0/tests/", test_id),
httr::authenticate(api_key, ""),
httr::content_type("application/vnd.api+json")
)
# If this still returns the test ID, the report is not ready yet.
report_id <- jsonlite::fromJSON(rawToChar(res_test_status$content))$data$id
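To wait for the report programmatically, here is a minimal polling sketch. It assumes httr follows the redirect once the test completes, so the JSON:API type field flips from "test" to "report"; adjust the sleep interval to taste:
repeat {
  res_test_status <- httr::GET(
    url = paste0("https://gtmetrix.com/api/2.0/tests/", test_id),
    httr::authenticate(api_key, ""),
    httr::content_type("application/vnd.api+json")
  )
  parsed <- jsonlite::fromJSON(rawToChar(res_test_status$content))
  if (identical(parsed$data$type, "report")) break # report is ready
  Sys.sleep(3) # wait before polling again
}
report_id <- parsed$data$id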
4. Get report
res_report <- httr::GET(
url = paste0("https://gtmetrix.com/api/2.0/reports/", report_id),
httr::authenticate(api_key, ""),
httr::content_type("application/vnd.api+json")
)
# The report is a nested list with the results as you know them from GTmetrix.
report <- jsonlite::fromJSON(rawToChar(res_report$content))$data
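Individual metrics live under attributes. The attribute names below are assumptions taken from the v2.0 docs, so verify them against your own report:
report$attributes$gtmetrix_grade # e.g. "A"
report$attributes$fully_loaded_time # in milliseconds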
I'm kinda tempted to build something for this as there seems to be no R library for it...
I'm receiving the data I'm requesting but don't understand how to extract it into a usable structure. Here is the POST request:
library(httr)
url <- "http://tools-cluster-interface.iedb.org/tools_api/mhci/"
body <- list(method="recommended", sequence_text="SLYNTVATLYCVHQRIDV", allele="HLA-A*01:01,HLA-A*02:01", length="8,9")
data <- httr::POST(url, body = body,encode = "form", verbose())
If I print the data with:
data
...it shows the request details followed by a nicely formatted table. However, if I try to extract the content with:
httr::content(data, "text")
...it returns a single string containing all the values of the original table. The output looks delimited by "\", but I couldn't str_replace or tease it out properly.
I'm new to making requests from R (and httr) and assume there's an option I'm missing in httr. Any advice?
API details here: http://tools.iedb.org/main/tools-api/
The best way to do this is to specify the MIME type:
content(data, type = 'text/tab-separated-values')
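With the type set, content() parses the body straight into a data frame; note that httr delegates TSV parsing to the readr package (in recent httr versions), so readr needs to be installed. A quick sketch:
df <- content(data, type = "text/tab-separated-values")
head(df)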
I'm trying to fetch data from the Google Plus API but I only know how to search if I know the user_id.
Here's how I get the JSON using the RCurl library:
data <- getURL(paste0("https://www.googleapis.com/plus/v1/people/",
user_id,"/activities/public?maxResults=100&key=", api_key),
ssl.verifypeer = FALSE)
I have tried formatting the URL as shown in Google's documentation, like so:
data <- getURL(paste0("https://www.googleapis.com/plus/v1/activities/",
keyword,"?key=",api_key),ssl.verifypeer = FALSE)
but it doesn't work.
Is it even possible to search using a keyword from R? R isn't listed among the supported programming languages for the API, according to this link.
I figured out how to make it work.
The GET request should be formatted as:
data <- getURL(paste0("https://www.googleapis.com/plus/v1/activities?key=",
                      api_key, "&query=", search_string),
               ssl.verifypeer = FALSE)
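One caveat worth adding: if the keyword contains spaces or other special characters, URL-encode it first with base R's URLencode() (the keyword below is a made-up example):
search_string <- URLencode("nba finals", reserved = TRUE)
data <- getURL(paste0("https://www.googleapis.com/plus/v1/activities?key=",
                      api_key, "&query=", search_string),
               ssl.verifypeer = FALSE)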
Using R Server, I want to simply read raw text (like readLines in base) from an Azure Data Lake. I can connect and get data like so:
library(RevoScaleR)
rxSetComputeContext("local")
oAuth <- rxOAuthParameters(params)
hdFS <- RxHdfsFileSystem(params)
file1 <- RxTextData("/path/to/file.txt", fileSystem = hdFS)
RxTextData doesn't actually fetch the data when that line is executed; it acts more like a symbolic link. When you run something like:
rxSummary(~. , data=file1)
Then the data is retrieved from the data lake. However, it is always read in and treated as a delimited file. I want to either:
Download the file and store it locally with R code (preferably not), or
Use some sort of readLines equivalent to get the data from the data lake, but read it in 'raw' so that I can do my own data quality checks.
Does this functionality exist yet? If so, how is this done?
EDIT: I have also tried:
returnDataFrame = FALSE
inside RxTextData. This returns a list. But as I've stated, the data isn't read in immediately from the data lake until I run something like rxSummary, which then attempts to read it as a regular file.
Context: I have a "bad" CSV file containing line feeds inside double quotes. This causes RxTextData to break. However, my script detects these occurrences and fixes them accordingly. Therefore, I don't want RevoScaleR to read in the data and try and interpret the delimiters.
I found a method of doing this by calling the Azure Data Lake Store REST API (adapted from a demo in Hadley Wickham's httr package on GitHub):
library(httpuv)
library(httr)
# 1. Insert the app name ----
app_name <- 'Any name'
# 2. Insert the client Id ----
client_id <- 'clientId'
# 3. API resource URI ----
resource_uri <- 'https://management.core.windows.net/'
# 4. Obtain OAuth2 endpoint settings for azure. ----
azure_endpoint <- oauth_endpoint(
authorize = "https://login.windows.net/<tenantId>/oauth2/authorize",
access = "https://login.windows.net/<tenantId>/oauth2/token"
)
# 5. Create the app instance ----
myapp <- oauth_app(
appname = app_name,
key = client_id,
secret = NULL
)
# 6. Get the token ----
mytoken <- oauth2.0_token(
azure_endpoint,
myapp,
user_params = list(resource = resource_uri),
use_oob = FALSE,
as_header = TRUE,
cache = FALSE
)
# 7. Get the file. --------------------------------------------------------
test <- content(GET(
url = "https://accountName.azuredatalakestore.net/webhdfs/v1/<PATH>?op=OPEN",
add_headers(
Authorization = paste("Bearer", mytoken$credentials$access_token),
`Content-Type` = "application/json"
)
)) ## Returns as a binary body.
library(data.table) # fread() comes from data.table
df <- fread(readBin(test, "character")) # readBin() converts the raw body to text
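And if you want a readLines()-style character vector instead of a parsed table, split the raw body yourself (a sketch, assuming \n line endings):
rawLines <- strsplit(rawToChar(test), "\n", fixed = TRUE)[[1]]
head(rawLines)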
You can do it with ScaleR functions like so. Set the delimiter to a character that doesn't occur in the data, and ignore column names. This will create a data frame containing a single character column which you can manipulate as necessary.
# assuming that ASCII 0xff/255 won't occur in the data
src <- RxTextData("file", fileSystem = "hdfs", delimiter = "\xff", firstRowIsColNames = FALSE)
dat <- rxDataStep(src)
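The single column then behaves like readLines() output; a sketch (indexing by position, since the column name is whatever RevoScaleR assigns):
rawLines <- as.character(dat[[1]])
head(rawLines)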
Although, given that Azure Data Lake is really meant for storing big datasets and this one seems small enough to fit in memory, I wonder why you couldn't just copy it to your local disk...
I would like to retrieve a list of tweets from Twitter for a given hashtag using package RJSONIO in R. I think I am pretty close to the solution, but I seem to miss one step.
My code reads as follows (in this example, I use #NBA as a hashtag):
library(httr)
library(RJSONIO)
# 1. Find OAuth settings for twitter:
# https://dev.twitter.com/docs/auth/oauth
oauth_endpoints("twitter")
# Replace key and secret below
myapp <- oauth_app("twitter",
key = "XXXXXXXXXXXXXXX",
secret = "YYYYYYYYYYYYYYYYY"
)
# 3. Get OAuth credentials
twitter_token <- oauth1.0_token(oauth_endpoints("twitter"), myapp)
# 4. Use API
req=GET("https://api.twitter.com/1.1/search/tweets.json?q=%23NBA&src=typd",
config(token = twitter_token))
req <- content(req, as = "text")
response <- fromJSON(req)
How can I get the list of tweets from object 'response'?
Eventually, I would like to get something like:
searchTwitter("#NBA", n=5000, lang="en")
Thanks a lot in advance!
The response object should be a list of length two: statuses and search_metadata. So, for example, to get the text of the first tweet, try:
response$statuses[[1]]$text
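To collect the text of every returned tweet into a character vector, a sketch against the parsed RJSONIO list:
tweets <- vapply(response$statuses, function(s) s$text, character(1))
head(tweets)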
However, there are a couple of R packages designed to make just this kind of thing easier: Try streamR for the streaming API, and twitteR for the REST API. The latter has a searchTwitter function exactly as you describe.