JSON URL From NBA Website Not Working Anymore

I've been working on a project that scrapes data from the nba.com stats website using R. A couple of months ago I was able to use it easily, but now the URL does not seem to work and I can't figure out why. Looking at the website, it doesn't seem like the URL changed at all, but I can't even access it via my browser.
library(rjson)
url <- "https://stats.nba.com/stats/scoreboardV2?DayOffset=0&LeagueID=00&gameDate=02%2F07%2F2020"
data_json <- fromJSON(file = url)
Is anyone else experiencing this problem?

It was a header-related issue. The following fixed it:
library(rjson)     # fromJSON
library(httr)      # GET, add_headers
library(magrittr)  # the %>% pipe

url <- "https://stats.nba.com/stats/scoreboardV2?DayOffset=0&LeagueID=00&gameDate=02%2F07%2F2020"
headers <- c(
  `Connection` = 'keep-alive',
  `Accept` = 'application/json, text/plain, */*',
  `x-nba-stats-token` = 'true',
  `User-Agent` = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36',
  `x-nba-stats-origin` = 'stats',
  `Sec-Fetch-Site` = 'same-origin',
  `Sec-Fetch-Mode` = 'cors',
  `Referer` = 'http://stats.nba.com/%referer%/',
  `Accept-Encoding` = 'gzip, deflate, br',
  `Accept-Language` = 'en-US,en;q=0.9'
)
res <- GET(url, add_headers(.headers = headers))
data_json <- res$content %>%
  rawToChar() %>%
  fromJSON()
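To sanity-check the parsed result, here is a minimal sketch that turns the first result set into a data frame. It assumes rjson-style nested lists and the usual stats.nba.com payload shape (a resultSets list whose elements carry name, headers and rowSet); adjust if your parser simplifies the structure differently.
# Flatten the first result set; NULL cells become NA
rs <- data_json$resultSets[[1]]
rows <- lapply(rs$rowSet, function(r)
  unlist(lapply(r, function(v) if (is.null(v)) NA else v)))
scores <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)
names(scores) <- unlist(rs$headers)
head(scores)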

Related

RDash - getting an error when refreshing the page and uploading the CSV file

This is Dash-R code for uploading a CSV file.
I get the following error when I refresh the page:
error: non-character argument
request: 127.0.0.1 - ID_127.0.0.1 [15/Jul/2020:22:22:38 +0530] "POST /_dash-update-component HTTP/1.1" 500 0 "http://127.0.0.1:8050/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36"
While uploading the CSV file, I get the following error:
error: could not find function "base64_dec"
request: 127.0.0.1 - ID_127.0.0.1 [15/Jul/2020:22:23:08 +0530] "POST /_dash-update-component HTTP/1.1" 500 0 "http://127.0.0.1:8050/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36"
library(dash)
library(dashCoreComponents)
library(dashHtmlComponents)
library(dashTable)

app <- Dash$new()

app$layout(htmlDiv(list(
  dccUpload(
    id = 'upload-data',
    children = htmlDiv(list(
      'Drag and Drop or ',
      htmlA('Select Files')
    )),
    style = list(
      'width' = '100%',
      'height' = '60px',
      'lineHeight' = '60px',
      'borderWidth' = '1px',
      'borderStyle' = 'dashed',
      'borderRadius' = '5px',
      'textAlign' = 'center',
      'margin' = '10px'
    ),
    # Allow multiple files to be uploaded
    multiple = TRUE
  ),
  htmlDiv(id = 'output-data-upload')
)))

parse_contents = function(contents, filename, date){
  content_type = strsplit(contents, ",")
  content_string = strsplit(contents, ",")
  decoded = base64_dec(content_string)
  if('csv' %in% filename){
    df = read.csv(utf8::as_utf8(decoded))
  } else if('xls' %in% filename){
    df = read.table(decoded, encoding = 'bytes')
  } else{
    return(htmlDiv(list(
      'There was an error processing this file.'
    )))
  }
  return(htmlDiv(list(
    htmlH5(filename),
    htmlH6(anytime(date)),
    dashDataTable(df_to_list('records'), columns = lapply(colnames(df), function(x){list('name' = x, 'id' = x)})),
    htmlHr(),
    htmlDiv('Raw Content'),
    htmlPre(paste(substr(toJSON(contents), 1, 100), "..."), style = list(
      'whiteSpace' = 'pre-wrap',
      'wordBreak' = 'break-all'
    ))
  )))
}

app$callback(
  output = list(id = 'output-data-upload', property = 'children'),
  params = list(input(id = 'upload-data', property = 'contents'),
                state(id = 'upload-data', property = 'filename'),
                state(id = 'upload-data', property = 'last_modified')),
  function(list_of_contents, list_of_names, list_of_dates){
    if(is.null(list_of_contents) == FALSE){
      children = lapply(1:length(list_of_contents), function(x){
        parse_contents(list_of_contents[[x]], list_of_names[[x]], list_of_dates[[x]])
      })
    }
    return(children)
  })

app$run_server()
I am having the same problem following the dashr example that uses the upload component to read a CSV file. I detected a few lines that are not working properly, but I am still unable to get a data frame from the file in a straightforward way.
Regarding the error could not find function "base64_dec": I found that the jsonlite package has a base64_dec function that seems to do what is intended. You can qualify the function with the package name when calling it:
decoded = jsonlite::base64_dec(content_string)
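A quick round-trip check of what it does (a throwaway example, not part of the app):
# "aGVsbG8=" is the base64 encoding of "hello"
raw_bytes <- jsonlite::base64_dec("aGVsbG8=")
rawToChar(raw_bytes)
#> [1] "hello"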
Regarding the error non-character argument: it is generated by this line when the app loads, because contents is still empty:
# This gives an error if run before reading the data
content_type = strsplit(contents, ",")
# In any case it should look like the line below, because the contents come wrapped in a list and you want the first element
content_type = strsplit(contents, ",")[[1]][1]
Dash runs the callback once when the app starts, but here we need the function to execute only after a file has been selected, and the condition in the if statement is not doing that job:
# Will execute the code in the if even before selecting data:
if(is.null(list_of_contents) == FALSE)
# Will execute only when data has been selected:
if(length(list_of_contents[[1]]) > 0)
The main issue is that once you have the decoded binary file, read.csv can't read it (at least not as written in the example code, because read.csv expects a file name as input). Something that partially worked for me is a readBin() call, but you need to know the size of the table in advance, which is not practical in my case because it will be different every time.
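For what it's worth, read.csv can also parse from a string through its text argument, which sidesteps readBin() and temporary files entirely; a minimal sketch, assuming contents is the data-URL string supplied by dccUpload:
# Decode the base64 payload and parse the resulting text directly
content_string <- strsplit(contents, ",")[[1]][2]
decoded <- jsonlite::base64_dec(content_string)
df <- read.csv(text = rawToChar(decoded), stringsAsFactors = FALSE)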
This is the complete code, modified to solve the issues described above, but the core part of reading the CSV is still not functional. I also fixed the conditions that check whether the selected file is a CSV or Excel file (because they were not working properly):
library(dashCoreComponents)
library(dashHtmlComponents)
library(dash)
library(dashTable)   # needed for dashDataTable
library(jsonlite)    # needed for toJSON and jsonlite::base64_dec
library(anytime)

app <- Dash$new()

app$layout(htmlDiv(list(
  dccUpload(
    id = 'upload-data',
    children = htmlDiv(list(
      'Drag and Drop or ',
      htmlA('Select Files')
    )),
    style = list(
      'width' = '100%',
      'height' = '60px',
      'lineHeight' = '60px',
      'borderWidth' = '1px',
      'borderStyle' = 'dashed',
      'borderRadius' = '5px',
      'textAlign' = 'center',
      'margin' = '10px'
    ),
    # Allow multiple files to be uploaded
    multiple = TRUE
  ),
  htmlDiv(id = 'output-data-upload')
)))

parse_contents = function(contents, filename, date){
  print("Inside function parse")
  content_type = strsplit(contents, ",")[[1]][1]
  content_string = strsplit(contents, ",")[[1]][2]
  #print(content_string)
  decoded = jsonlite::base64_dec(content_string)
  #print(decoded)
  if(grepl(".csv", filename, fixed = TRUE)){
    print("csv file selected")
    ## Here a function to read a csv file from the binary data is needed,
    ## because read.csv asks for the file NAME.
    ## readBin() can read it, but you need to know the size of the table to parse it properly.
    #as.data.frame(readBin(decoded, character()))
    #df = read.csv(utf8::as_utf8(decoded))
  } else if(grepl(".xlsx", filename, fixed = TRUE)){
    ## A way to read the Excel file is also needed
    df = read.table(decoded, encoding = 'bytes')
  } else{
    return(htmlDiv(list(
      'There was an error processing this file.'
    )))
  }
  return(htmlDiv(list(
    htmlH5(filename),
    htmlH6(anytime(date)),
    dashDataTable(df_to_list('records'), columns = lapply(colnames(df), function(x){list('name' = x, 'id' = x)})),
    htmlHr(),
    htmlDiv('Raw Content'),
    htmlPre(paste(substr(toJSON(contents), 1, 100), "..."), style = list(
      'whiteSpace' = 'pre-wrap',
      'wordBreak' = 'break-all'
    ))
  )))
}

app$callback(
  output = list(id = 'output-data-upload', property = 'children'),
  params = list(input(id = 'upload-data', property = 'contents'),
                state(id = 'upload-data', property = 'filename'),
                state(id = 'upload-data', property = 'last_modified')),
  function(list_of_contents, list_of_names, list_of_dates){
    if(length(list_of_contents[[1]]) > 0){
      print("Inside if")
      children = lapply(1:length(list_of_contents), function(x){
        parse_contents(list_of_contents[[x]], list_of_names[[x]], list_of_dates[[x]])
      })
      return(children)
    }
  })

app$run_server()
I hope they revise this example to make it work.

R: Extracting latitude, longitude and time from JSON file

I have a .json file (more than 100,000 lines) containing the following information:
POST /log?lat=36.804121354&lon=-1.270256482&time=2016-05-18T17:39:59.004Z
{ 'content-type': 'application/x-www-form-urlencoded',
'content-length': '29',
host: 'ip_address:port',
connection: 'Keep-Alive',
'accept-encoding': 'gzip',
'user-agent': 'okhttp/3.7.0' }
BODY: lat=36.804121354&lon=-1.270256482
POST /log?lat=36.804123256&lon=-1.270254711&time=2016-05-18T17:40:13.004Z
{ 'content-type': 'application/x-www-form-urlencoded',
'content-length': '29',
host: 'ip_address:port',
connection: 'Keep-Alive',
'accept-encoding': 'gzip',
'user-agent': 'okhttp/3.7.0' }
BODY: lat=36.804123256&lon=-1.270254711
POST /log?lat=36.804124589&lon=-1.270255641&time=2016-05-18T17:41:05.004Z
{ 'content-type': 'application/x-www-form-urlencoded',
'content-length': '29',
host: 'ip_address:port',
connection: 'Keep-Alive',
'accept-encoding': 'gzip',
'user-agent': 'okhttp/3.7.0' }
BODY: lat=36.804124589&lon=-1.270255641
.......
The above information repeats with updated latitude, longitude and time values. Using R, how can I extract latitude, longitude and time from this file and store them in a data frame like this:
id lat lon time
1 36.804121354 -1.270256482 2016-05-18 17:39:59
2 36.804123256 -1.270254711 2016-05-18 17:40:13
3 36.804124589 -1.270255641 2016-05-18 17:41:05
It doesn't appear your data is strictly JSON. Since the requested data is all contained in the "POST" lines, one solution is to filter those lines out and then parse them.
# Read lines
x <- readLines("test.txt")
# Find lines beginning with "POST"
posts <- x[grep("^POST", x)]
# Remove the prefix "POST /log?"
posts <- sub("^POST /log\\?", "", posts)
# Split the remaining fields on the &
fields <- unlist(strsplit(posts, "\\&"))
# Remove the prefixes ("lat=", "lon=", "time=")
fields <- sub("^.*=", "", fields)
# Make a data frame (assume the fields are always in the same order)
df <- as.data.frame(matrix(fields, ncol = 3, byrow = TRUE), stringsAsFactors = FALSE)
names(df) <- c("lat", "lon", "time")
# Convert the columns to the proper types
df$lat <- as.numeric(df$lat)
df$lon <- as.numeric(df$lon)
df$time <- as.POSIXct(df$time, format = "%FT%T", tz = "UTC")
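The desired output also has an id column; sequential row numbers can supply it (a small addition on top of the code above):
# Prepend a sequential id to match the requested layout
df <- cbind(id = seq_len(nrow(df)), df)
head(df)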

How to get table from html form using rvest or httr?

I am using R, version 3.3.1. I am trying to scrape data from the following web site:
http://plovila.pomorstvo.hr/
As you can see, it is an HTML form. I would like to choose "Tip objekta" (object type), for example "Jahta" (yacht), and enter the "NIB" (an integer, e.g. 93567). You can try it yourself: just choose "Jahta" and type 93567 in the NIB field.
Method is POST, type application/x-www-form-urlencoded. I have tried 3 different approaches: using rvest, POST (httr package) and postForm (Rcurl). My rvest code is:
session <- html_session("http://plovila.pomorstvo.hr")
form <- html_form(session)[[1]]
form <- set_values(form,
  `ctl00$Content_FormContent$uiTipObjektaDropDown` = 2,
  `ctl00$Content_FormContent$uiOznakaTextBox` = "",
  `ctl00$Content_FormContent$uiNibTextBox` = 93567)
x <- submit_form(session, form)
If I run this code I get a 200 status, but I don't understand how I can get the table. An additional step is to submit the Detalji button and get additional information, but I can't see any information in the x submit output.
I used the curlconverter package to take the "Copy as cURL" data from the XHR POST request and turn it automagically into:
httr::VERB(verb = "POST", url = "http://plovila.pomorstvo.hr/",
httr::add_headers(Origin = "http://plovila.pomorstvo.hr",
`Accept-Encoding` = "gzip, deflate",
`Accept-Language` = "en-US,en;q=0.8",
`X-Requested-With` = "XMLHttpRequest",
Connection = "keep-alive",
`X-MicrosoftAjax` = "Delta=true",
Pragma = "no-cache", `User-Agent` = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.34 Safari/537.36",
Accept = "*/*", `Cache-Control` = "no-cache",
Referer = "http://plovila.pomorstvo.hr/",
DNT = "1"), httr::set_cookies(ASP.NET_SessionId = "b4b123vyqxnt4ygzcykwwvwr"),
body = list(`ctl00$uiScriptManager` = "ctl00$Content_FormContent$ctl00|ctl00$Content_FormContent$uiPretraziButton",
ctl00_uiStyleSheetManager_TSSM = ";|635908784800000000:d29ba49:3cef4978:9768dbb9",
`ctl00$Content_FormContent$uiTipObjektaDropDown` = "2",
`ctl00$Content_FormContent$uiImeTextBox` = "",
`ctl00$Content_FormContent$uiNibTextBox` = "93567",
`__EVENTTARGET` = "", `__EVENTARGUMENT` = "",
`__LASTFOCUS` = "", `__VIEWSTATE` = "/wEPDwUKMTY2OTIzNTI1MA9kFgJmD2QWAgIDD2QWAgIBD2QWAgICD2QWAgIDD2QWAmYPZBYIAgEPZBYCZg9kFgZmD2QWAgIBDxAPFgYeDURhdGFUZXh0RmllbGQFD05heml2VGlwT2JqZWt0YR4ORGF0YVZhbHVlRmllbGQFDElkVGlwT2JqZWt0YR4LXyFEYXRhQm91bmRnZBAVBAAHQnJvZGljYQVKYWh0YQbEjGFtYWMVBAEwATEBMgEzFCsDBGdnZ2cWAQICZAIBDw8WAh4HVmlzaWJsZWdkFgICAQ8PFgIfA2dkZAICDw8WAh8DaGQWAgIBDw8WBB4EVGV4dGUfA2hkZAIHDzwrAA4CABQrAAJkFwEFCFBhZ2VTaXplAgoBFgIWCw8CCBQrAAhkZGRkZDwrAAUBBAUHSWRVcGlzYTwrAAUBBAUISWRVbG9za2E8KwAFAQQFBlNlbGVjdGRlFCsAAAspelRlbGVyaWsuV2ViLlVJLkdyaWRDaGlsZExvYWRNb2RlLCBUZWxlcmlrLldlYi5VSSwgVmVyc2lvbj0yMDEzLjMuMTExNC40MCwgQ3VsdHVyZT1uZXV0cmFsLCBQdWJsaWNLZXlUb2tlbj0xMjFmYWU3ODE2NWJhM2Q0ATwrAAcACyl1VGVsZXJpay5XZWIuVUkuR3JpZEVkaXRNb2RlLCBUZWxlcmlrLldlYi5VSSwgVmVyc2lvbj0yMDEzLjMuMTExNC40MCwgQ3VsdHVyZT1uZXV0cmFsLCBQdWJsaWNLZXlUb2tlbj0xMjFmYWU3ODE2NWJhM2Q0ARYCHgRfZWZzZGQWBB4KRGF0YU1lbWJlcmUeBF9obG0LKwQBZGZkAgkPZBYCZg9kFgJmD2QWIAIBD2QWBAIDDzwrAAgAZAIFDzwrAAgAZAIDD2QWBAIDDzwrAAgAZAIFDzwrAAgAZAIFD2QWAgIDDzwrAAgAZAIHD2QWBAIDDzwrAAgAZAIFDzwrAAgAZAIJD2QWBAIDDzwrAAgAZAIFDzwrAAgAZAILD2QWBgIDDxQrAAI8KwAIAGRkAgUPFCsAAjwrAAgAZGQCBw8UKwACPCsACABkZAIND2QWBgIDDxQrAAI8KwAIAGRkAgUPFCsAAjwrAAgAZGQCBw8UKwACPCsACABkZAIPD2QWAgIDDxQrAAI8KwAIAGRkAhEPZBYGAgMPPCsACABkAgUPPCsACABkAgcPPCsACABkAhMPZBYGAgMPPCsACABkAgUPPCsACABkAgcPPCsACABkAhUPZBYCAgMPPCsACABkAhcPZBYGAgMPPCsACABkAgUPPCsACABkAgcPPCsACABkAhkPPCsADgIAFCsAAmQXAQUIUGFnZVNpemUCBQEWAhYLZGRlFCsAAAsrBAE8KwAHAAsrBQEWAh8FZGQWBB8GZR8HCysEAWRmZAIbDzwrAA4CABQrAAJkFwEFCFBhZ2VTaXplAgUBFgIWC2RkZRQrAAALKwQBPCsABwALKwUBFgIfBWRkFgQfBmUfBwsrBAFkZmQCHQ88KwAOAgAUKwACZBcBBQhQYWdlU2l6ZQIFARYCFgtkZGUUKwAACysEATwrAAcACysFARYCHwVkZBYEHwZlHwcLKwQBZGZkAiMPPCsADgIAFCsAAmQXAQUIUGFnZVNpemUCBQEWAhYLZGRlFCsAAAsrBAE8KwAHAAsrBQEWAh8FZGQWBB8GZR8HCysEAWRmZAILD2QWAmYPZBYCZg9kFgICAQ88KwAOAgAUKwACZBcBBQhQYWdlU2l6ZQIFARYCFgtkZGUUKwAACysEATwrAAcACysFARYCHwVkZBYEHwZlHwcLKwQBZGZkZIULy2JISPTzELAGqWDdBkCVyvvKIjo/wm/iG9PT1dlU",
`__VIEWSTATEGENERATOR` = "CA0B0334",
`__PREVIOUSPAGE` = "jGgYHmJ3-6da6PzGl9Py8IDr-Zzb75YxIFpHMz4WQ6iQEyTbjWaujGRHZU-1fqkJcMyvpGRkWGStWuj7Uf3NYv8Wi0KSCVwn435kijCN2fM1",
`__ASYNCPOST` = "true",
`ctl00$Content_FormContent$uiPretraziButton` = "Pretraži"),
encode = "form") -> res
You can see the result of that via:
content(res, as="text") # returns raw HTML
or
content(res, as="parsed") # returns something you can use with `rvest` / `xml2`
Unfortunately, this is yet another useless SharePoint website that "eGov" sites around the world have bought into as a good thing to do. That means you have to do trial and error to figure out which of those parameters is necessary since it's different on virtually every site. I tried a minimal set to no avail.
You may even have to issue a GET request to the main site first to establish a session.
But this should get you going in the right direction.

equivalence of -d parameter in curl in httr package of R

I'm following the official manual of the opencpu package in R. In chapter 4.3, Calling a function, it uses curl to test the API:
curl http://your.server.com/ocpu/library/stats/R/rnorm -d "n=10&mean=100"
and the sample output is:
/ocpu/tmp/x032a8fee/R/.val
/ocpu/tmp/x032a8fee/stdout
/ocpu/tmp/x032a8fee/source
/ocpu/tmp/x032a8fee/console
/ocpu/tmp/x032a8fee/info
I can use curl to get a similar result, but when I try to send this HTTP request using the httr package in R, I don't know how to replicate the result. Here is what I tried:
resp <- POST(
url = "localhost/ocpu/library/stats/R/rnorm",
body= "n=10&mean=100"
)
resp
the output is:
Response [HTTP://localhost/ocpu/library/stats/R/rnorm]
Date: 2015-10-16 00:51
Status: 400
Content-Type: text/plain; charset=utf-8
Size: 30 B
No Content-Type header found.
I guess I don't understand what the equivalent of curl's -d parameter is in httr. How can I get it right?
Try this :)
library(httr)
library(jsonlite)

getFunctionEndPoint <- function(url, format) {
  return(paste(url, format, sep = '/'))
}

resp <- POST(
  url = getFunctionEndPoint(
    url = "https://public.opencpu.org/ocpu/library/stats/R/rnorm",
    format = "json"),
  body = list(n = 10, mean = 100),
  encode = 'json')

fromJSON(rawToChar(resp$content))
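For completeness: the 400 response body above ("No Content-Type header found.") points at the original problem, since a bare string body is sent without a Content-Type header. The literal equivalent of curl's -d flag is a form-encoded body, which httr builds from a named list when you pass encode = "form":
library(httr)

# curl -d "n=10&mean=100" sends application/x-www-form-urlencoded;
# encode = "form" reproduces that
resp <- POST(
  url = "https://public.opencpu.org/ocpu/library/stats/R/rnorm",
  body = list(n = 10, mean = 100),
  encode = "form"
)
content(resp, as = "text")  # the /ocpu/tmp/... session paths, as in the curl output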

RCurl JSON data to JIRA REST add issue

I'm trying to POST data to a JIRA project using R and I keep getting: Error: Bad Request. At first I thought it must be the JSON format that I created, so I wrote the JSON to a file and ran a curl command from the console (see below), and the POST worked just fine.
curl -D- -u fred:fred -X POST -d #sample.json -H "Content-Type: application/json" http://localhost:8090/rest/api/2/issue/
Which brings the issue back to my R code. Can someone tell me what I am doing wrong with the RCurl postForm?
Source:
library(RJSONIO)
library(RCurl)
x <- list(
  fields = list(
    project = c(
      c(key = "TEST")
    ),
    summary = "The quick brown fox jumped over the lazy dog",
    description = "silly old billy",
    issuetype = c(name = "Task")
  )
)

curl.opts <- list(
  userpwd = "fred:fred",
  verbose = TRUE,
  httpheader = c('Content-Type' = 'application/json', Accept = 'application/json'),
  useragent = "RCurl"
)

postForm("http://jirahost:8080/jira/rest/api/2/issue/",
  .params = c(data = toJSON(x)),
  .opts = curl.opts,
  style = "POST"
)
rm(list=ls())
gc()
Here's the output of the response:
* About to connect() to jirahost port 80 (#0)
* Trying 10.102.42.58... * connected
* Connected to jirahost (10.102.42.58) port 80 (#0)
> POST /jira/rest/api/2/issue/ HTTP/1.1
User-Agent: RCurl
Host: jirahost
Content-Type: application/json
Accept: application/json
Content-Length: 337
< HTTP/1.1 400 Bad Request
< Date: Mon, 07 Apr 2014 19:44:08 GMT
< Server: Apache-Coyote/1.1
< X-AREQUESTID: 764x1525x1
< X-AUSERNAME: anonymous
< Cache-Control: no-cache, no-store, no-transform
< Content-Type: application/json;charset=UTF-8
< Set-Cookie: atlassian.xsrf.token=B2LW-L6Q7-15BO- MTQ3|bcf6e0a9786f879a7b8df47c8b41a916ab51da0a|lout; Path=/jira
< Connection: close
< Transfer-Encoding: chunked
<
* Closing connection #0
Error: Bad Request
You might find it easier to use httr, which has been constructed with the needs of modern APIs in mind and tends to set better default options. The equivalent httr code would be:
library(httr)
x <- list(
  fields = list(
    project = c(key = "TEST"),
    summary = "The quick brown fox jumped over the lazy dog",
    description = "silly old billy",
    issuetype = c(name = "Task")
  )
)

POST("http://jirahost:8080/jira/rest/api/2/issue/",
  body = RJSONIO::toJSON(x),
  authenticate("fred", "fred", "basic"),
  add_headers("Content-Type" = "application/json"),
  verbose()
)
If that doesn't work, you'll need to supply the output from a successful verbose curl run on the console and a failed httr call in R.
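If you'd rather stay with RCurl, my reading of the failure is that postForm form-encodes its .params (your JSON ends up as a form field named data) rather than sending it as the raw request body. curlPerform with the postfields option sends the string verbatim; a sketch, reusing x and the credentials from the question:
library(RCurl)
library(RJSONIO)

# postfields sends the JSON string as-is in the POST body,
# instead of wrapping it in form encoding like postForm does
curlPerform(
  url = "http://jirahost:8080/jira/rest/api/2/issue/",
  postfields = toJSON(x),
  httpheader = c('Content-Type' = 'application/json', Accept = 'application/json'),
  userpwd = "fred:fred",
  verbose = TRUE
)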
