Client error (400) using gsheet2tbl to get Google Sheet data

Since RCurl no longer works for importing data into R from Google Sheets, I have been using gsheet2tbl.
This has been working well but today I was trying to download from a recently created Google Sheet and I received the following error:
url2<-"https://docs.google.com/spreadsheets/d/.../edit?usp=sharing"
d <- gsheet2tbl(url2, sheetid = 0)
Error in parse.response(r, parser, encoding = encoding) :
client error: (400) Bad Request
I double checked and everything is working just fine with my previously created Google Sheets.
Does anyone have any thoughts on how I can troubleshoot this issue?
Thanks very much,
Matt

There's a new package for reading from Google Sheets... https://github.com/jennybc/googlesheets
I find that it's fantastic for this sort of work. Give it a shot...
devtools::install_github("jennybc/googlesheets")
library(googlesheets)
# The first call triggers browser-based authentication and lists your sheets.
gs_ls()
# Register the sheet by its title, then read a worksheet from it.
sheet <- gs_title("Your sheet")
gs_read(sheet, ws = "Your worksheet")
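Another quick sanity check that needs no extra packages is to hit Google's CSV export endpoint for the sheet directly; if that also returns a 400, the problem is likely the new sheet's sharing settings rather than gsheet2tbl itself. A minimal sketch, where <KEY> is a placeholder for the document ID from your sharing URL:
# Read a shared Google Sheet via its CSV export endpoint.
# <KEY> stands for the document ID; gid=0 selects the first worksheet tab.
url <- "https://docs.google.com/spreadsheets/d/<KEY>/export?format=csv&gid=0"
d <- read.csv(url, stringsAsFactors = FALSE)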


Unable to pull JSON data from stats.nba.com

I've been having some difficulty getting data from stats.nba.com. I've been able to pull info pretty easily in the past, so I wanted to see if you notice any issues in my code or if you're running into the same problem.
I'm using rjson.
library(rjson)
url <- "https://stats.nba.com/stats/boxscoresummaryv2?GameID=0041800406"
a <- fromJSON(file = url)
When I run this, I get:
Error in file(con, "r") :
cannot open the connection to 'https://stats.nba.com/stats/boxscoresummaryv2?GameID=0041800406'
In addition: Warning message:
In file(con, "r") :
URL 'https://stats.nba.com/stats/boxscoresummaryv2?GameID=0041800406': status was 'Failure when receiving data from the peer'
I can, however, see the data in JSON format by following the request URL. Does anybody notice any mistakes I'm making?
The following code reads the JSON file into a list object.
library(jsonlite)
# Returns the parsed JSON as a nested list (no simplification to data frames by default).
read_json("https://stats.nba.com/stats/boxscoresummaryv2?GameID=0041800406")
Not really a specific answer; I think it was due to some issue with my firewall. I was able to get everything to work on a different network.
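For anyone who can't just switch networks: stats.nba.com is also known to reject requests that don't look like they come from a browser, so sending browser-style headers sometimes helps. A sketch using httr (the exact headers the endpoint expects can change over time):
library(httr)
library(jsonlite)

url <- "https://stats.nba.com/stats/boxscoresummaryv2?GameID=0041800406"
# Browser-style headers; the endpoint has been picky about User-Agent and Referer.
resp <- GET(url, add_headers(
  `User-Agent` = "Mozilla/5.0",
  Referer = "https://stats.nba.com/"
))
data <- fromJSON(content(resp, as = "text", encoding = "UTF-8"))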

Previously working Python script using Google Reporting API v4 now returning 403

I wrote a Python script to pull yesterday's data from Google Analytics. I'm using OAuth v2 with Google Reporting API v4. The backbone of the script is essentially the same as Google's sample version, except I included recursion to overcome the pagination limitation and am outputting the results to a CSV file.
Today it started to return a 403 error:
HttpError 403 when requesting https://analyticsreporting.googleapis.com/v4/reports:batchGet?alt=json returned "The caller does not have permission"
I did my due diligence by searching for a solution, but I am already using the View ID, and the computer it's running on isn't signed into any other accounts (it exists only to run reports). I've also tried creating a new client_secrets.json file and verified that I am within quotas, but the issue persists. Nothing changed between yesterday and today, yet it refuses to run today.
EDIT
I'm using the same connection object; it's only instantiated once, and the code is exactly the same as on Google's website here -> Hello Analytics Reporting API v4 - Python
import argparse
import httplib2
from googleapiclient.discovery import build
from oauth2client import client, file, tools

# CLIENT_SECRETS_PATH, SCOPES and DISCOVERY_URI are defined elsewhere in the script.
def initialize_analyticsreporting():
    parser = argparse.ArgumentParser(
        formatter_class=argparse.RawDescriptionHelpFormatter,
        parents=[tools.argparser])
    flags = parser.parse_args([])

    # Set up a flow object to be used if we need to authenticate.
    flow = client.flow_from_clientsecrets(
        CLIENT_SECRETS_PATH, scope=SCOPES,
        message=tools.message_if_missing(CLIENT_SECRETS_PATH))

    # Reuse stored credentials when they exist and are still valid.
    storage = file.Storage('analyticsreporting.dat')
    credentials = storage.get()
    if credentials is None or credentials.invalid:
        credentials = tools.run_flow(flow, storage, flags)

    http = credentials.authorize(http=httplib2.Http())

    # Build the v4 Analytics Reporting service object.
    analytics = build('analytics', 'v4', http=http, discoveryServiceUrl=DISCOVERY_URI)
    return analytics
I'm invoking the batchGet method on each request like so...
response = analytics.reports().batchGet(body=loaded_request.get("request", {})).execute()
I've managed to work around this by using exponential back-off in conjunction with recursion and a try/except block, similar to the method recommended by Google here -> Error Responses
Like so:
# These imports belong at the top of the script:
# import time
# from googleapiclient.errors import HttpError
try:
    response = analytics.reports().batchGet(body=loaded_request.get("request", {})).execute()
except HttpError as err:
    print(err)
    # Sleep 2**n seconds, then retry recursively, giving up after 5 attempts.
    time.sleep(2 ** exponential_backoff)
    exponential_backoff += 1
    if exponential_backoff < 5:
        get_response(analytics, request, page_token, file_name, exponential_backoff)
    else:
        print("exponential_backoff:", exponential_backoff, "exceeded")
        return
If the error hits, then by the time n > 1 the retry usually works just fine. I'm not terribly fond of this method; ideally I would like it to work correctly on the first request.
If there is no other solution, then hopefully this will help someone in the future.

Microsoft-Cognitive topic detection issue

I'm trying to use the topic detection API from Microsoft Cognitive Services through R (the 'mscstexta4r' package; I provided the key to it), and it is not working. My subscription is through my university and I'm using my laptop at home. Could that be the reason for the problem?
Specifically, the error I'm getting is 'Error: mscstexta4r: Not Found (HTTP 404). - { "statusCode": 404, "message": "Resource not found" }'
The stoplist I'm using is a customized one, and the data consists of 760 '.txt' documents of no more than 5KB each (225KB in total).
Looks like you have a typo in the URL: you wrote texta instead of text, i.e.
https://westus.api.cognitive.microsoft.com/texta/analytics/v2.0/
To submit a job the POST URL should be:
https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/topics
To query for the job status, it’s a GET request to:
https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/operations/{operationId}
See the Text Analytics API reference for more details about the parameters.
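If you want to rule out the package itself, the same POST/GET flow can be tested directly with httr. A minimal sketch; the subscription key is a placeholder, and note that the topics endpoint required a substantial batch of documents per request:
library(httr)

base <- "https://westus.api.cognitive.microsoft.com/text/analytics/v2.0"
key  <- "<your-subscription-key>"  # placeholder

# Submit the topic-detection job; a 202 response carries an Operation-Location
# header that points at the status endpoint for this operation.
docs <- list(documents = list(list(id = "1", text = "Example document text.")))
resp <- POST(paste0(base, "/topics"),
             add_headers(`Ocp-Apim-Subscription-Key` = key),
             body = docs, encode = "json")
status_url <- headers(resp)[["operation-location"]]

# Poll the operation endpoint until the job reports it is finished.
status <- GET(status_url, add_headers(`Ocp-Apim-Subscription-Key` = key))
content(status)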

R - Accessing Google BigQuery with R. Authentication failed

I am trying to access Google Big Query with R, using the 'assertthat' and 'bigrquery' packages, following these instructions:
http://thinktostart.com/using-google-bigquery-with-r/#comment-22450
http://www.lunametrics.com/blog/2014/06/25/google-analytics-data-mining-bigquery-r/
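For reference, the step I'm running is roughly this (a sketch based on those tutorials; the project ID and query are placeholders):
library(bigrquery)

# Placeholder project and legacy-SQL query, per the tutorials linked above.
project <- "<your-project-id>"
sql <- "SELECT repository_name FROM [publicdata:samples.github_timeline] LIMIT 10"

# The first call opens the browser for OAuth and asks for an authorization code.
query_exec(sql, project = project)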
The issue comes at the authentication step: I get directed to a code in the web browser, and when I paste the code into the terminal the following error appears:
Enter authorization code:
####CODE GOES HERE#####
Error in function (type, msg, asError = TRUE) :
Could not resolve host: accounts.google.com
I think that one possible issue is that we are behind a corporate firewall. While we do have access to the internet and I can install R packages, if I ping google.com from the terminal, I get an error. But I would like to know if any of you have found a solution to this kind of problem.
Thank you very much for reading this post. Any help is appreciated.
I found a solution to the issue. It was related to the corporate proxies: if I use the visitors' wifi, I can run queries.
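If switching networks isn't practical, another option may be to point R at the corporate proxy explicitly (a sketch; the host and port are placeholders you would get from your IT department):
# Tell R's HTTP stack about the corporate proxy (placeholder host/port).
Sys.setenv(http_proxy  = "http://proxy.example.com:8080",
           https_proxy = "http://proxy.example.com:8080")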

How to avoid "too many redirects" error when using readLines(url) in R?

I am trying to mine news articles from various sources by doing
site = readLines(link)
link being the URL of the site I am trying to download. Most of the time this works, but with some specific sources I get the error:
Error in file(con, "r") : cannot open the connection
In addition: Warning message:
In file(con, "r") : too many redirects, aborting ...
I'd like to avoid this, but so far I have had no success in doing so.
Replicating this is quite easy as virtually none of the New York Times links work
e.g. http://www.nytimes.com/2014/08/01/us/politics/african-leaders-coming-to-talk-business-may-also-be-pressed-on-rights.html
It seems like the NYT site forces redirects for cookie and tracking purposes. It looks like the built-in URL reader isn't able to deal with them correctly (I'm not sure it supports cookies, which is probably the problem).
Anyway, you might consider using the RCurl package to access the file instead. Try
library(RCurl)
link <- "http://www.nytimes.com/2014/08/01/us/politics/african-leaders-coming-to-talk-business-may-also-be-pressed-on-rights.html?_r=0"
# Enable an in-memory cookie jar, present a browser user agent, and follow redirects.
site <- getURL(link, .opts = curlOptions(
  cookiejar = "", useragent = "Mozilla/5.0", followlocation = TRUE
))
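These days the httr package handles cookies and redirects out of the box, so an equivalent alternative (a sketch; untested against the NYT's current redirect behavior) is:
library(httr)

# GET follows redirects by default and keeps cookies for the request.
resp <- GET(link, user_agent("Mozilla/5.0"))
site <- content(resp, as = "text", encoding = "UTF-8")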
