Accessing SharePoint in R

My script recently started throwing errors when accessing SharePoint. It used to work.
sp_con = sp_connection("https://asdf.sharepoint.com/sites/staff",
credentialFile = "H:/SharePoint API/creds.yml", Office365 = T)
The error was:
Error in sp_connection("https://asdf.sharepoint.com/sites/staff", :
Receiving access cookies failed.
In addition: Warning message:
In readLines(file) :
incomplete final line found on 'Y:/Operations/SharePoint API/creds.yml'
I googled the warning message and found suggested fixes, but I still get the access-cookies error. Thanks in advance for any ideas!
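One thing worth checking (a minimal sketch, not a confirmed fix): the readLines() warning only means that creds.yml does not end with a newline. Rewriting the file adds the trailing newline and silences the warning, although the access-cookies failure may well have a separate cause (expired credentials or a changed authentication flow on the SharePoint side). The YAML field names depend on the sharepointr credential format, so the check at the end is an assumption:

# Re-save the credentials file so it ends with a newline
creds_path <- "H:/SharePoint API/creds.yml"   # path from the question
txt <- readLines(creds_path, warn = FALSE)    # read without the warning
writeLines(txt, creds_path)                   # writeLines() adds the trailing newline

# Sanity check that the YAML still parses
creds <- yaml::read_yaml(creds_path)
str(creds)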

Related

Web Scraping with R: error related to reset of the connection with server

I have a problem obtaining data from a specific website. When trying to download the raw page with R 3.6.3 using the following example code:
website_raw <- readLines("https://tge.pl/gaz-rdn?dateShow=09-02-2022")
The result I got is:
Error in file(con, "r") : cannot open the connection
In addition: Warning message:
In file(con, "r") :
  InternetOpenUrl failed: 'the connection with the server was reset'
The readLines() method used to work fine on this website, but for about a week it has been failing. I have also tried download.file(): at first the result was the same (error, connection reset), but after setting options(download.file.method = "libcurl") the file starts to download and then suddenly stops with:
trying URL 'https://tge.pl/gaz-rdn?dateShow=09-02-2022'
Error in download.file("https://tge.pl/gaz-rdn?dateShow=09-02-2022", "test.html") :
cannot open URL 'https://tge.pl/gaz-rdn?dateShow=09-02-2022'
In addition: Warning message:
In download.file("https://tge.pl/gaz-rdn?dateShow=09-02-2022", "test.html") :
URL 'https://tge.pl/gaz-rdn?dateShow=09-02-2022': status was 'Failure when receiving data from the peer'
I have also tried disabling "Use Internet Explorer library/proxy for HTTP" in the RStudio Global Options, but it didn't help. Another approach I tested was read_html() from the rvest package, which gives the following error:
Error in open.connection(x, "rb") : Send failure: Connection was reset
Downloading data from other websites works fine with all of the methods above, though.
Is there any way I can download data from this website with R?
Any kind of help or suggestion will be highly appreciated.

How to resolve a connection issue with a URL when using the `open` function

In order to find some features that I need, I want to establish a connection to a website using open(mycon, "r"). To do this, I used the code below, which was provided by @Dunois:
myx <- httr::HEAD(example)$url
mycon <- url(myx)
open(mycon, "r")
where example is a link to a website. This code works for most websites; however, in some cases, such as "https://www.pixilink.com/140079#mode=tour" or "https://www.pixilink.com/141152#mode=0", it doesn't. These pages exist and open fine in my browser, so I am not sure why the connection cannot be established. The error message I get is:
Error in open.connection(mycon, "r") : cannot open the connection
In addition: Warning message:
In open.connection(mycon, "r") :
  cannot open URL 'https://www.pixilink.com/140079#mode=tour': HTTP status was '400 Bad Request'
I would appreciate it if you could shed some light on this and clarify why I get this error message.
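One possibility, offered only as an assumption: the #mode=... fragment is meant for the browser and is normally not sent to the server, but depending on the connection method it can end up in the request and trigger a 400. A small sketch that strips the fragment before opening the connection:

myx <- httr::HEAD("https://www.pixilink.com/140079#mode=tour")$url

# Drop the client-side fragment before opening the connection
# (assumption: the fragment is what the server rejects with 400)
myx_clean <- sub("#.*$", "", myx)

mycon <- url(myx_clean)
open(mycon, "r")
close(mycon)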

Why is there a database connection issue in the RNCEP package?

I am trying to use the "RNCEP" package in RStudio. I ran the following code:
install.packages("RNCEP", dependencies=TRUE)
library(RNCEP)
wx.extent <- NCEP.gather(variable= 'air', level=850, months.minmax=c(8,9),
years.minmax=c(2006,2007), lat.southnorth=c(50,55), lon.westeast=c(0,5),
reanalysis2 = FALSE, return.units = TRUE)
I got the following error messages:
trying URL
'http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/ncep.reanalysis/pressure/air.2006.nc.das'
Content length 660 bytes
Error in NCEP.gather.pressure(variable = variable, months.minmax =
months.minmax, :
There is a problem connecting to the NCEP database with the
information provided.
Try entering
http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/ncep.reanalysis/pressure/air.2006.nc.das
into a web browser to obtain an error message.
In addition: Warning messages:
1: In
download.file(paste("http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/ncep.reanalysis",
: cannot open URL
'http://www.cfauth.com/?cfru=aHR0cDovL3d3dy5lc3JsLm5vYWEuZ292L3BzZC90aHJlZGRzL2RvZHNDL0RhdGFzZXRzL25jZXAucmVhbmFseXNpcy9wcmVzc3VyZS9haXIuMjAwNi5uYy5kYXM=':
HTTP status was '401 Unauthorized'
Please suggest the correct syntax to download NCEP data.
Thanks,
Sam
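A diagnostic sketch, not RNCEP syntax: the warning shows the request being redirected to www.cfauth.com with a 401, which looks like a local web filter or proxy intercepting the connection rather than a problem with the NCEP.gather() call itself. Checking the .das URL directly from R can confirm that; the proxy host, port, and credentials below are placeholders:

library(httr)

das_url <- "http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/ncep.reanalysis/pressure/air.2006.nc.das"

resp <- GET(das_url)
status_code(resp)   # a 401, or a redirect to cfauth.com, points to a network-side block

# If the network requires an authenticated proxy, it can be supplied explicitly
# (host, port, and credentials here are placeholders, not real values):
# resp <- GET(das_url, use_proxy("proxy.example.com", 8080, "user", "pass"))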

“Error in open.connection(x, "rb") : Timeout was reached”

When I do some web scraping (using a for loop to scrape multiple pages), sometimes, after scraping the 35th out of 40 pages, I get the following error:
“Error in open.connection(x, "rb") : Timeout was reached”
And sometimes I receive in addition this message:
“In addition: Warning message: closing unused connection 3”
Below is a list of things I would like to clarify:
1) I have read that I might need to define the user agent explicitly. I have tried that with:
read_html(curl('www.link.com', handle = curl::new_handle("useragent" = "Mozilla/5.0")))
but it did not change anything.
2) I noticed that when I turn on a VPN and change location, sometimes my scraping works without any error. I would like to understand why.
3) I have also read it might depend on the proxy. I would like to understand how and why.
4) In addition to the error itself, I would like to understand this warning, as it might be a clue to understanding the error:
Warning message: closing unused connection 3
Does that mean that when I am web scraping I should call a function at the end to close the connection?
I have already read the following Stack Overflow posts, but there is no clear resolution:
Iterating rvest scrape function gives: "Error in open.connection(x, "rb") : Timeout was reached"
rvest Error in open.connection(x, "rb") : Timeout was reached
Error in open.connection(x, "rb") : Couldn't connect to server
Did you try this?
https://stackoverflow.com/a/38463559
library(rvest)
url = "http://google.com"
download.file(url, destfile = "scrapedpage.html", quiet=TRUE)
content <- read_html("scrapedpage.html")

gtrendsR error HTTP 410

I am new to the gtrendsR package, which I am running with R 3.4.1 on Windows 10.
gconnect() succeeds, but I get the following error message for any type of query passed to gtrends(), as below.
library(gtrendsR)
gconnect(usr=my_user_name,psw=my_password)
google.trends = gtrends(c("NHL"), geo="US",start_date="2017-01-01")
Error: Not enough search volume. Please change your search terms.
In addition: Warning message:
In request_GET(x, url, ...) : Gone (HTTP 410).
Does anybody have any ideas on how to solve this?
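The HTTP 410 ("Gone") usually means the old Google Trends endpoint that the gconnect()-based workflow talked to has been retired. Newer gtrendsR releases no longer use gconnect() at all; a sketch against the current interface (argument names assume a recent gtrendsR, roughly 1.4 or later):

library(gtrendsR)

# No gconnect() in recent versions; the date range goes into `time`
google.trends <- gtrends(keyword = "NHL", geo = "US",
                         time = "2017-01-01 2017-12-31")

head(google.trends$interest_over_time)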
