Error when scraping 2019 data with nflscrapR. Just started getting it

Everything was working fine up until last Tuesday. I ran it again this weekend and today, and I get this error:
Error in curlPerform(curl = curl, .opts = opts, .encoding = .encoding) :
embedded nul in string: '\037‹\b'
install.packages("devtools")
devtools::install_github(repo = "maksimhorowitz/nflscrapR")
library(nflscrapR)
pbp_2019 <- scrape_season_play_by_play(2019, weeks = 9)
I expected to get the data as always, but the error above pops up every time.
Any ideas?

I have since redownloaded nflscrapR; I had to add force = TRUE to the install_github() call to get it to actually redownload.
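For reference, the forced reinstall looks like this (same repo as above):
devtools::install_github(repo = "maksimhorowitz/nflscrapR", force = TRUE)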

Related

PDFPageCountError: Unable to get page count

I am trying to use pdf2image, but I am getting this error:
PDFPageCountError: Unable to get page count.
I/O Error: Couldn't open file 'C:\Users\user_name\Desktop\folder_name\folder2_name\folder3_name\007-084841-1 to 31 Dec'22': No error.
It is confusing, as it doesn't give an actual reason; it just says 'No error'.
My code is:
import os
from pdf2image import convert_from_path
from PIL import Image
import pytesseract

doc = convert_from_path("C:\\Users\\user_name\\Desktop\\folder_name\\folder2_name\\folder3_name\\007-084841-1 to 31 Dec'22")
path, fileName = os.path.split("C:\\Users\\user_name\\Desktop\\folder_name\\folder2_name\\folder3_name\\007-084841-1 to 31 Dec'22")
fileBaseName, fileExtension = os.path.splitext(fileName)
for page_number, page_data in enumerate(doc):
    txt = pytesseract.image_to_string(Image.fromarray(page_data)).encode("utf-8")
    print("Page # {} - {}".format(str(page_number), txt))
Can anyone help me, please?
I don't know what to try, as the error message just says Unable to open...: No error

Error when trying to parse an HTTP request in R

I'm using the R package httr to get an HTTP response for a specific link.
When trying to parse the content of the response, I get the error:
Fehler in parse(text = script_content) : <text>:1:10: Unerwartete(s) '['
1: {"lines":[
Translated to English it says something like this (sorry for my error messages being in German):
Error in parse(text = script_content) : <text>:1:10: Unexpected '['
1: {"lines":[
It seems as there is a problem with the format/encoding of the text. Here is my code:
script <- GET(
  url = "https://my_url.which_origin_is_not_important/my_script.R",
  authenticate(username, pass)
)
script_content <- content(script, as = "text", encoding = "ISO-8859-1")
parsed_content <- parse(text = script_content)
The value of script_content looks like this:
"{\"lines\":[{\"text\":\"################## FUNCTION ##################\"},{\"text\":\"\"},{\"text\":\"library(log4r)\"}],\"start\":0,\"size\":32,\"isLastPage\":true,\"limit\":500,\"nextPageStart\":null}"
Some more background on this operation: I'm trying to source code that currently sits in a private repository. I wrote the code I'm trying to source myself, and I made sure the issue is not coming from within the code.
I got this approach from: Sourcing R files in a private github folder
Thanks for any advice!!
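The value of script_content above is JSON (each source line is wrapped in a "lines" array), not raw R code, which is why parse() stops at the '['. A minimal sketch of one way around this, assuming the structure shown above: extract the text fields with jsonlite, reassemble the script, then parse it.
library(jsonlite)

# The body is JSON, not R source: each line of the script sits in
# the "text" field of the "lines" array. Rebuild the script text first.
parsed_json <- jsonlite::fromJSON(script_content)
script_text <- paste(parsed_json$lines$text, collapse = "\n")
parsed_content <- parse(text = script_text)
eval(parsed_content)  # equivalent to sourcing the reconstructed script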

Using RCurl, how to download the "clone from github" zip file?

If I want to download a clone as a zip, it does a redirect.
zip.url = "https://github.com/MonteShaffer/humanVerse/archive/refs/heads/main.zip"
redirects to:
<html><body>You are being redirected.</body></html>
I am trying to use the RCurl library:
require(RCurl)
curl.fun = basicTextGatherer();
curl.ch = getCurlHandle();
x = getBinaryURL(zip.url, curl = curl.ch, headerfunction = curl.fun$update )
On Windows 10, it throws this error:
Error in function (type, msg, asError = TRUE) :
error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version
I am assuming GitHub is doing multiple redirects. I want to download the file as a binary zip.
You have to set the curl option followlocation to TRUE, like this:
binary_blob <- RCurl::getBinaryURL(zip.url, .opts = list(followlocation = TRUE))
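The call returns the zip as a raw vector; to save it to disk, base R's writeBin() works (the file name is illustrative):
writeBin(binary_blob, "main.zip")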
It might be easier to download the file instead, with one of the following two options:
utils::download.file() comes with R and works for this.
zip.url <- "https://github.com/MonteShaffer/humanVerse/archive/refs/heads/main.zip"
download.file(zip.url, "main.zip")
The curl package has curl_download().
library(curl)
curl::curl_download(zip.url, "main2.zip")
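Either way, once the file is on disk, the archive can be unpacked with base R (the target directory is illustrative):
unzip("main.zip", exdir = "humanVerse-main")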

Connecting using RAdwords library and using doAuth gives error

I have been trying to connect to an AdWords account using RAdwords, but I get the following error on doAuth():
Error in rjson::fromJSON(RCurl::postForm("https://accounts.google.com/o/oauth2/token", :
  STRING_ELT() can only be applied to a 'character vector', not a 'raw'
I have the correct credentials and developer token, but I am still unable to resolve the problem. I am using Windows 7. The traceback is as follows:
> traceback()
4: .Call("fromJSON", json_str, unexpected.escape, simplify, PACKAGE = "rjson")
3: rjson::fromJSON(RCurl::postForm("https://accounts.google.com/o/oauth2/token",
.opts = opts, code = credlist$c.token, client_id = credlist$c.id,
client_secret = credlist$c.secret, redirect_uri = "urn:ietf:wg:oauth:2.0:oob",
grant_type = "authorization_code", style = "POST"))
2: loadToken(credentials)
1: doAuth()
I have looked at and tried all the suggestions from other similar questions, and I have also installed this version of RAdwords:
install_github('jburkhardt/RAdwords', ref = "refresh_token_raw_data")
Install the RAdwords package from the following GitHub branch containing a bug fix for your issue:
require(devtools)
install_github('jburkhardt/RAdwords', ref = "bugfix_char_to_raw")
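After installing from that branch, re-run the authentication; a minimal sketch (doAuth() will take you through the OAuth prompts again):
library(RAdwords)
doAuth()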

setup_twitter_oauth, searchTwitter and Rscript

I run the following script using an installation of RStudio on a Linux server.
require(twitteR)
require(plyr)
setup_twitter_oauth(consumer_key='xxx', consumer_secret='xxx',
access_token='xxx', access_secret='xxx')
searchResults <- searchTwitter("#vds", n=15000, since = as.character(Sys.Date()-1), until = as.character(Sys.Date()))
head(searchResults)
tweetsDf = ldply(searchResults, function(t) t$toDataFrame())
write.csv(tweetsDf, file = paste("tweets_vds_", Sys.Date(), ".csv", sep = ""))
The script works fine when I run it from the user interface.
However, when I run it automatically from the terminal using crontab, I get the following error message:
[1] "Using direct authentication"
Error in twInterfaceObj$getMaxResults :
could not find function "loadMethod"
Calls: searchTwitter -> doRppAPICall -> $
Execution halted
Why?
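Presumably this is because loadMethod() comes from the methods package, which an interactive R session attaches automatically but older versions of Rscript did not. A minimal sketch of the usual workaround, assuming that is the cause here: attach methods explicitly at the top of the script.
library(methods)  # Rscript did not attach methods by default in older R versions
require(twitteR)
require(plyr)
# ... rest of the script unchanged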
