How do I pass an array into the query of httr?
The request URL should look like this:
https://www.example.com/xyz?type=3&type=5
My current code looks like this:
POST(url,
query = data.frame("something" = "somethingElse", type = ),
add_headers(.headers = c("token" = token)),
encode = "json")
How do I add those types from the url example to my R example?
The default encoding in httr doesn't like the same name being used multiple times, but you can split your values into a list that has duplicate names. Here's a helper function I've used for that:
flattenbody <- function(x) {
  # A form/query can only have one value per name, so take
  # any values that contain vectors length >1 and
  # split them up
  # list(x=1:2, y="a") becomes list(x=1, x=2, y="a")
  if (all(lengths(x) <= 1)) return(x)
  do.call("c", mapply(function(name, val) {
    if (length(val) == 1 || any(c("form_file", "form_data") %in% class(val))) {
      x <- list(val)
      names(x) <- name
      x
    } else {
      x <- as.list(val)
      names(x) <- rep(name, length(val))
      x
    }
  }, names(x), x, USE.NAMES = FALSE, SIMPLIFY = FALSE))
}
Then you could use it with something like
POST(url,
query = flattenbody(list(something="somethingElse", type = c(3, 5))),
add_headers(.headers = c("token" = token)),
encode = "json"
)
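To see what a flattened list turns into on the wire, here is a small base-R sketch that builds the query string by hand without calling httr. The names flatten_query and to_query_string are hypothetical helpers made up for this illustration, and the encoding step only approximates what an HTTP client does.

```r
# Hypothetical helper: split vector values into repeated name/value pairs,
# e.g. list(type = c(3, 5)) becomes list(type = 3, type = 5).
flatten_query <- function(x) {
  do.call(c, lapply(names(x), function(name) {
    vals <- as.list(x[[name]])
    names(vals) <- rep(name, length(vals))
    vals
  }))
}

# Hypothetical helper: join the pairs as name=value&name=value.
to_query_string <- function(x) {
  vals <- vapply(
    x,
    function(v) utils::URLencode(as.character(v), reserved = TRUE),
    character(1)
  )
  paste0(names(x), "=", vals, collapse = "&")
}

q <- flatten_query(list(something = "somethingElse", type = c(3, 5)))
query_string <- to_query_string(q)
query_string
# "something=somethingElse&type=3&type=5"
```

The repeated "type" name is what produces the type=3&type=5 part of the URL from the question.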
I have a data frame with observations on YouTube video_ids. When used in an API call, these ids allow me to fetch data on certain videos that I use to enrich my dataset.
First I created a list of unique video_ids with the below script. This returns a large list of 6350 unique elements.
video_ids <- list();
index <- 1
for(i in unique(df$video_id)){
video_ids[[index]] <- list(
video_id = i
)
index <- index + 1
}
The API documentation asks for a comma-separated list of video IDs. I built that with unlist(video_ids), which returns a large vector. I cannot use this whole vector in the API call, because it is far too long.
The maximum amount of ids I can process in one API call is 50.
library(httr)
api_key = "xxxx"
process_ids = unlist(video_ids[1:50]) #pass the first 50 elements of the video_ids list
url <- modify_url("https://www.googleapis.com/youtube/v3/videos",
query = list(
"part" = "snippet",
"id" = paste(process_ids, collapse=","),
"key" = api_key)
)
output <- content(GET(url), as = "parsed", type = "application/json")
What is the best approach for this in R? Can I loop through my list of 6350 elements by 50 items each loop, removing these items from the list when the loop completes?
My current script below loops through each video ID in the list and fetches the data I need from the API response. This works, but it is very slow and requires a lot of loops and API calls (6350 of them). It can't be the most efficient way to approach this.
result <- list();
index <- 1
for (id in video_ids) {
api_key = "xxxx"
url <- modify_url("https://www.googleapis.com/youtube/v3/videos",
query = list(
"part" = "snippet",
"id" = paste(id, collapse=","),
"key" = api_key)
)
output <- content(GET(url), as = "parsed", type = "application/json")
#Adds what I need from the output to a list called result
for(t in output$items){
result[[index]] <- list(
video_id = t$id,
channel_id = t$snippet$channelId
)
}
index <- index + 1
}
You can try the following:
Split the video IDs into groups of 50 and pass each group to the API.
vec = unlist(video_ids)
result <- lapply(split(vec, ceiling(seq_along(vec)/50)), function(x) {
url <- modify_url("https://www.googleapis.com/youtube/v3/videos",
query = list(
"part" = "snippet",
"id" = paste(x, collapse=","),
"key" = api_key))
content(GET(url), as = "parsed", type = "application/json")
})
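The chunking step and the clean-up of the responses can be tried offline. The sketch below uses made-up IDs and a made-up response (only the id and snippet$channelId fields named in the question are assumed), and extract_items is a hypothetical helper name.

```r
# Made-up IDs standing in for unlist(video_ids).
ids <- sprintf("video%04d", 1:6350)

# Group the IDs 50 at a time, then build one comma-separated string per group.
chunks <- split(ids, ceiling(seq_along(ids) / 50))
id_params <- vapply(chunks, paste, character(1), collapse = ",")

# Hypothetical helper: turn one parsed response into a data frame.
extract_items <- function(parsed) {
  do.call(rbind, lapply(parsed$items, function(t) {
    data.frame(
      video_id = t$id,
      channel_id = t$snippet$channelId,
      stringsAsFactors = FALSE
    )
  }))
}

# Fake response with the same shape the question's loop reads from.
fake_response <- list(items = list(
  list(id = "a1", snippet = list(channelId = "c1")),
  list(id = "b2", snippet = list(channelId = "c2"))
))
videos <- extract_items(fake_response)
```

With the real list of responses, something like do.call(rbind, lapply(result, extract_items)) should give one row per video.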
I want to call an API that returns all my shop orders. I have a total of 86,000 orders; the ID of the first order is 2 and the ID of the most recent order is 250,000. The order IDs are obviously not consecutive, for some reason I don't know yet, but that doesn't matter.
I started a simple script with a for loop where the ID gets updated in every loop, like this:
library(jsonlite)
library(httr)
user = "my_name"
token = "xyz"
y = 0
urls = rep("api.com/orders/", 250000)
for(i in urls){
y = y + 1
url = paste0(i, y)
a = httr::GET(url, authenticate(user, token))
a_content = httr::content(a, as = "text", encoding = "UTF-8")
a_json = jsonlite::fromJSON(a_content, flatten = T)
...
}
The problem is that whenever there is an ID with no order, the loop stops with "{\"success\":false,\"message\":\"Order by id 1 not found\"}". So I somehow have to extend the code with some if/else logic, like "if the ID does not match an order, proceed to the next ID". I also want to write all orders into a new list.
Any help appreciated.
Maybe tryCatch can solve the problem. The loop below calls the url_exists function posted by user #hrbrmstr.
I haven't tested this, but I would do something along these lines:
base_url <- "api.com/orders/"
last_order_number <- 250000
for(i in seq.int(last_order_number)){
url = paste0(base_url, i)
if(url_exists(url)){
a <- httr::GET(url, authenticate(user, token))
a_content <- httr::content(a, as = "text", encoding = "UTF-8")
a_json <- jsonlite::fromJSON(a_content, flatten = TRUE)
}
}
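Another way to skip missing orders and collect the rest into a list is sketched below. It is untested against the real API: collect_orders and fetch_order_httr are hypothetical names, and the check on res$success assumes missing orders come back in the {"success":false,...} shape quoted in the question.

```r
# Hypothetical fetcher using httr; returns NULL on an HTTP error status.
# It is only defined here, not called, so the sketch runs without network access.
fetch_order_httr <- function(id, base_url, user, token) {
  resp <- httr::GET(paste0(base_url, id), httr::authenticate(user, token))
  if (httr::http_error(resp)) return(NULL)
  txt <- httr::content(resp, as = "text", encoding = "UTF-8")
  jsonlite::fromJSON(txt, flatten = TRUE)
}

# Loop over IDs, skip anything that errors or reports success = FALSE,
# and store the remaining orders in a list keyed by ID.
collect_orders <- function(ids, fetch_order) {
  orders <- list()
  for (id in ids) {
    res <- tryCatch(fetch_order(id), error = function(e) NULL)
    if (is.null(res) || isFALSE(res$success)) next
    orders[[as.character(id)]] <- res
  }
  orders
}

# Offline stand-in for the API: odd IDs are "missing", ID 4 throws an error.
fake_fetch <- function(id) {
  if (id == 4) stop("connection failed")
  if (id %% 2 == 0) list(success = TRUE, id = id) else list(success = FALSE, message = "not found")
}
orders <- collect_orders(1:6, fake_fetch)
```

For the real run, fetch_order could be function(id) fetch_order_httr(id, base_url, user, token).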
I am trying to generate a dataset of comments from a YouTube video and am having trouble looping over the pageToken using the Google API. Below is a snippet of code. Why doesn't the 'while' loop work?
base_url <- "https://www.googleapis.com/youtube/v3/commentThreads/"
data = "list"
api_opts <- list(
part = "snippet",
maxResults = 100,
textFormat = "plainText",
videoId = "N708P-A45D0", # This is an example of a video id
key = "google developer key goes here",
fields = "items,nextPageToken",
orderBy = "published")
init_results <- httr::content(httr::GET(base_url, query = api_opts))
data <- init_results$items
api_opts$pageToken <- init_results$nextPageToken
api_opts$pageToken <- gsub("\\=", "", init_results$nextPageToken)
print(api_opts$pageToken)
while (api_opts$pageToken != "") {
print(api_opts$pageToken)
next_results <- httr::content(httr::GET(base_url, query = api_opts))
data <- c(data, next_results$items)
api_opts$pageToken <- gsub("\\=", "", next_results$nextPageToken)
}
organize_data = function(){
sub_data <- lapply(data, function(x) {
data.frame(
Comment = x$snippet$topLevelComment$snippet$textDisplay,
User = x$snippet$topLevelComment$snippet$authorDisplayName,
ReplyCount = x$snippet$totalReplyCount,
LikeCount = x$snippet$topLevelComment$snippet$likeCount,
PublishTime = x$snippet$topLevelComment$snippet$publishedAt,
CommentId = x$snippet$topLevelComment$id,
stringsAsFactors=FALSE)
})
}
sample <- organize_data()
L <- length(sample)
sample <- data.frame(matrix(unlist(sample), nrow=L, byrow=T))
colnames(sample) <- c("Comment", "User", "ReplyCount", "LikeCount", "PublishTime", "CommentId")
head(sample)
Thanks for looking. In case anyone else has this problem in the future, below is what I did to fix it. I still can't get the replies to the replies.
####
# NEW TRY
# Note: according to YouTube "YouTube currently supports replies only for top-level comments. However, replies to replies may be supported in the future."
####
rm(list=ls())
data = "list"
# Initialize
init_results <- httr::content(httr::GET("https://www.googleapis.com/youtube/v3/commentThreads?part=snippet%2C+replies&maxResults=100&textFormat=plainText&videoId=N708P-A45D0&fields=items%2CnextPageToken&key=[my google developer key]"))
data <- init_results$items
init_results$nextPageToken
print(init_results$nextPageToken)
# Begin loop
while (init_results$nextPageToken != ""){
# Make the page token URL encoded
api_opts_pageToken <- gsub("=", "%3D", init_results$nextPageToken)
# Write the call with the updated page token
get_call <- gsub("api_pageToken", api_opts_pageToken, "https://www.googleapis.com/youtube/v3/commentThreads?part=snippet%2C+replies&maxResults=100&pageToken=api_pageToken&textFormat=plainText&videoId=N708P-A45D0&fields=items%2CnextPageToken&key=[my google developer key]")
# Pull out the data from this page token call
next_results <- httr::content(httr::GET(get_call))
# Update the datafile
data <- c(data,next_results$items)
# Update the page token
print(next_results$nextPageToken)
init_results$nextPageToken <- next_results$nextPageToken
}
organize_data = function(){
sub_data <- lapply(data, function(x) {
data.frame(
Comment = x$snippet$topLevelComment$snippet$textDisplay,
User = x$snippet$topLevelComment$snippet$authorDisplayName,
ReplyCount = x$snippet$totalReplyCount,
LikeCount = x$snippet$topLevelComment$snippet$likeCount,
PublishTime = x$snippet$topLevelComment$snippet$publishedAt,
CommentId = x$snippet$topLevelComment$id,
stringsAsFactors=FALSE)
})
}
sample <- organize_data()
L <- length(sample)
sample <- data.frame(matrix(unlist(sample), nrow=L, byrow=T))
colnames(sample) <- c("Comment", "User", "ReplyCount", "LikeCount", "PublishTime", "CommentId")
head(sample)
dim(sample)
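One thing to watch in loops like the ones above: on the last page nextPageToken is typically absent (NULL), and comparing NULL with "" gives a zero-length condition, which while() rejects. A minimal sketch of a loop that stops on a missing token follows. fetch_all_pages and fetch_page_httr are hypothetical names, and the fake pages only imitate the items/nextPageToken shape used above.

```r
# Hypothetical fetcher using httr; defined but not called here.
# Passing the token through `query` should let httr URL-encode it,
# so stripping or replacing "=" by hand may not be needed.
fetch_page_httr <- function(token, base_url, api_opts) {
  api_opts$pageToken <- token   # a NULL token leaves pageToken out of the query
  httr::content(httr::GET(base_url, query = api_opts))
}

# Keep requesting pages until no further token comes back.
fetch_all_pages <- function(fetch_page) {
  items <- list()
  token <- NULL
  repeat {
    page <- fetch_page(token)
    items <- c(items, page$items)
    token <- page$nextPageToken
    if (is.null(token) || !nzchar(token)) break
  }
  items
}

# Offline stand-in: three pages, the last one without nextPageToken.
pages <- list(
  list(items = list("c1", "c2"), nextPageToken = "A="),
  list(items = list("c3"), nextPageToken = "B"),
  list(items = list("c4"))
)
fake_fetch <- function(token) {
  idx <- if (is.null(token)) 1 else match(token, c("A=", "B")) + 1
  pages[[idx]]
}
all_items <- fetch_all_pages(fake_fetch)
```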
In R, I need to return two objects from a function:
myfunction <- function()
{
a.data.frame <- read.csv(file = input.file, header = TRUE, sep = ",", dec = ".")
index.hash <- get_indices_function(colnames(a.data.frame))
alist <- list("a.data.frame" = a.data.frame, "index.hash" = index.hash)
return(alist)
}
But the objects returned from myfunction both become a list, not a data.frame and a hash.
Any help would be appreciated.
You can only return one object from an R function; this is consistent with pretty much every other language I've used. However, the objects retain their original structure within the list, so alist[[1]] and alist[[2]] are still the data frame and the hash respectively. Once the function has returned them, you can split them out into separate objects if you want :).
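A small sketch of that point, using an environment as a stand-in for the hash (the question's hash type and get_indices_function are not shown, so make_result and its contents are made up for illustration):

```r
# Return two objects wrapped in one named list.
make_result <- function() {
  a.data.frame <- data.frame(id = 1:3, value = c("a", "b", "c"))
  index.hash <- new.env()               # stand-in for the hash object
  assign("id", 1L, envir = index.hash)
  list(a.data.frame = a.data.frame, index.hash = index.hash)
}

res <- make_result()

# The elements keep their own classes inside the list.
df <- res$a.data.frame
idx <- res$index.hash
is.data.frame(df)     # TRUE
is.environment(idx)   # TRUE
```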
You can use a structure.
return (structure(class = "myclass",
list(data = daza.frame,
type = anytype,
page.content = page.content.as.string.vector,
knitr = knitr)))
Then you can access your data with
values <- myfunction(...)
values$data
values$type
values$page.content
values$knitr
and so on.
A working example from my package:
sju.table.values <- function(tab, digits=2) {
if (!inherits(tab, "ftable")) tab <- ftable(tab)
tab.cell <- round(100*prop.table(tab),digits)
tab.row <- round(100*prop.table(tab,1),digits)
tab.col <- round(100*prop.table(tab,2),digits)
tab.expected <- as.table(round(as.array(margin.table(tab,1)) %*% t(as.array(margin.table(tab,2))) / margin.table(tab)))
# -------------------------------------
# return results
# -------------------------------------
invisible (structure(class = "sjutablevalues",
list(cell = tab.cell,
row = tab.row,
col = tab.col,
expected = tab.expected)))
}
tab <- table(sample(1:2, 30, TRUE), sample(1:3, 30, TRUE))
# show expected values
sju.table.values(tab)$expected
# show cell percentages
sju.table.values(tab)$cell
In R I have a function "getHbasedPortfolio". The body of the function is as follows:
body("getHbasedPortfolio")
{
className <- name
pf = list(name = name,
get = function(x) pf[[x]],
set = function(x,value) pf[[x]] <- value
)
pf$getCash = function(date) {
data <-data.frame(name=name,value="null")
return(data)
}
pf$setCash = function(cash, date) {
a <- insertCashTable(pf$name, cash, date)
return("success")
}
pf <- list2env(pf)
class(pf) <- name
return(pf)
}
I need to separate the lines with commas, so I wrote some code to do that.
The code I used is as follows:
body <-"";
for(i in 1:length(as.character(body("getHbasedPortfolio")))){
body <- paste(body,as.character(body("getHbasedPortfolio"))[i])
body <- paste(body,";")
}
Now I am getting the following output.
body
[1] " { ; className <- name ; pf = list(name = name, get = function(x) pf[[x]], set = function(x, value) pf[[x]] <- value) ; pf$getCash = function(date) {\n data <- data.frame(name = name, value = \"null\")\n return(data)\n} ; pf$setCash = function(cash, date) {\n a <- insertCashTable(pf$name, cash, date)\n return(\"success\")\n} ; pf <- list2env(pf) ; class(pf) <- name ; return(pf) ;"
The problem is that the statements under the pf$setCash and pf$getCash are not separated by commas.
How can I overcome this problem?
You can use something like this:
dat <- data.frame(capture.output(getHbasedPortfolio))
This will save your function line by line in a data.frame. You can then save it in your database.
EDIT
You can use dat to write the function out line by line.
paste(data.frame(capture.output(getHbasedPortfolio))[,1],
collapse=',') ## I would use another separator here because comma is confusing
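A variation on the same idea, sketched here as an alternative rather than what the answer above does: deparse(body(f)) returns the body as one string per line, including the lines inside nested functions, so each statement gets its own separator. The function f below is a made-up example.

```r
# Made-up function with a nested function, similar in shape to the question's.
f <- function(x) {
  y <- x + 1
  g <- function(z) {
    z * 2
  }
  g(y)
}

# One element per line of the body; trimws() drops the indentation.
lines <- trimws(deparse(body(f)))

# Join with a separator other than a comma, since commas already appear in code.
one_line <- paste(lines, collapse = " ; ")
```

Unlike capture.output(f), this leaves out the function header and keeps only the body.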