googleCloudStorageR, gcs_save() works but gcs_load() does not

We have already authenticated, and now have the following code snippet:
googleCloudStorageR::gcs_save(
  iris,
  file = 'bucket-folder/iris.rda',
  bucket = 'our-gcs-bucket'
)
googleCloudStorageR::gcs_load(
  file = 'bucket-folder/iris.rda',
  bucket = 'our-gcs-bucket'
)
Here, gcs_save() works fine to save the RDA to GCS, but gcs_load() does not. We receive the error:
Error in curl::curl_fetch_disk(url, x$path, handle = handle): Failed to open file C:\Users\myname\path-to-file\iris.rda.
Request failed [ERROR]. Retrying in 1 seconds...
Error in curl::curl_fetch_disk(url, x$path, handle = handle): Failed to open file C:\Users\myname\path-to-file\iris.rda.
Request failed [ERROR]. Retrying in 1.3 seconds...
Error in curl::curl_fetch_disk(url, x$path, handle = handle) :
Failed to open file C:\Users\myname\path-to-file\iris.rda.
Error: Request failed before finding status code: Failed to open file C:\Users\myname\path-to-file\iris.rda.
I am confused as to why gcs_load() appears to be attempting to open the file locally rather than grabbing it from our GCS bucket. Are we using gcs_load() incorrectly here? How can we retrieve our saved RDA from GCS?
Edit: Perhaps also helpful and worth noting: the iris.rda saved in GCS is listed as having type text/plain in the GCS UI. Shouldn't the type be something like RDA or RData rather than text/plain?

Sorry, I missed this at the time. It's because you saved the object with a folder name, which GCS simply treats as a / in the object name; when you download it again, gcs_load() tries to write into that folder structure locally, but the folder does not exist.
To fix this, you could create the folder locally with dir.create():
googleCloudStorageR::gcs_save(
  iris,
  file = 'bucket-folder/iris.rda',
  bucket = 'our-gcs-bucket'
)
dir.create("bucket-folder")
googleCloudStorageR::gcs_load(
  file = 'bucket-folder/iris.rda',
  bucket = 'our-gcs-bucket'
)
✓ Saved bucket-folder/iris.rda to bucket-folder/iris.rda ( 1.1 Kb )
[1] TRUE
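If the object lives under nested folders, note that dir.create() only creates one level by default; recursive = TRUE creates the whole chain, and showWarnings = FALSE keeps it quiet when the folder already exists (the sub-folder below is hypothetical):
dir.create("bucket-folder/sub-folder", recursive = TRUE, showWarnings = FALSE)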
Or you could point the download at an existing location (e.g. without the folder prefix) using the saveToDisk argument:
googleCloudStorageR::gcs_load(
  file = 'bucket-folder/iris.rda',
  saveToDisk = "iris.rda",
  bucket = 'our-gcs-bucket'
)
✓ Saved bucket-folder/iris.rda to iris.rda ( 1.1 Kb )
[1] TRUE
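On the Edit about the content type: text/plain is most likely just a fallback GCS applies when it cannot guess a MIME type for an .rda upload, and gcs_load() does not look at it when downloading, so it is cosmetic. If you want a specific Content-Type stored, a sketch (an assumption on my part, using the type argument of gcs_upload() and saving the object yourself first) would be:
save(iris, file = "iris.rda")
googleCloudStorageR::gcs_upload(
  "iris.rda",
  bucket = 'our-gcs-bucket',
  name = 'bucket-folder/iris.rda',
  type = "application/octet-stream"  # explicit Content-Type instead of the text/plain fallback
)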

Related

R CMD check fails on Ubuntu when trying to download a file, but the function works within R

I am writing an R package, and one of its functions downloads and unzips a file from a link (it is not exported to the user, though):
download_f <- function(download_dir) {
  utils::download.file(
    url = "https://servicos.ibama.gov.br/ctf/publico/areasembargadas/downloadListaAreasEmbargadas.php",
    destfile = file.path(download_dir, "fines.rar"),
    mode = 'wb',
    method = 'libcurl'
  )
  utils::unzip(
    zipfile = file.path(download_dir, "fines.rar"),
    exdir = file.path(download_dir)
  )
}
This function works fine for me when I run it within another function to compile an example in a vignette.
However, with R CMD check in a GitHub Action, it fails consistently on Ubuntu 16.04, release and devel. It says [1]:
Error: Error: processing vignette 'IBAMA.Rmd' failed with diagnostics:
cannot open URL 'https://servicos.ibama.gov.br/ctf/publico/areasembargadas/downloadListaAreasEmbargadas.php'
--- failed re-building ‘IBAMA.Rmd’
SUMMARY: processing the following file failed:
‘IBAMA.Rmd’
Error: Error: Vignette re-building failed.
Execution halted
Error: Error in proc$get_built_file() : Build process failed
Calls: <Anonymous> ... build_package -> with_envvar -> force -> <Anonymous>
Execution halted
Error: Process completed with exit code 1.
When I run devtools::check() it never finishes, staying in "creating vignettes" forever. I don't know whether these problems are related, though, because there are other vignettes in the package.
The R CMD checks pass on macOS and Windows. I've tried switching the mode and method arguments of utils::download.file, but to no avail.
Any suggestions?
[1]: https://github.com/datazoompuc/datazoom.amazonia/pull/16/checks?check_run_id=2026865974
The download fails because libcurl tries to verify the web server's certificate, but can't.
I can reproduce this on my system:
trying URL 'https://servicos.ibama.gov.br/ctf/publico/areasembargadas/downloadListaAreasEmbargadas.php'
Error in utils::download.file(url = "https://servicos.ibama.gov.br/ctf/publico/areasembargadas/downloadListaAreasEmbargadas.php", :
cannot open URL 'https://servicos.ibama.gov.br/ctf/publico/areasembargadas/downloadListaAreasEmbargadas.php'
In addition: Warning message:
In utils::download.file(url = "https://servicos.ibama.gov.br/ctf/publico/areasembargadas/downloadListaAreasEmbargadas.php", :
URL 'https://servicos.ibama.gov.br/ctf/publico/areasembargadas/downloadListaAreasEmbargadas.php': status was 'SSL peer certificate or SSH remote key was not OK'
The server does not allow you to download over http but redirects to https, so the only thing to do is to tell curl not to verify the certificate and to accept what it gets.
You can do this by passing the -k flag to curl:
download_f <- function(download_dir) {
  utils::download.file(
    url = "https://servicos.ibama.gov.br/ctf/publico/areasembargadas/downloadListaAreasEmbargadas.php",
    destfile = file.path(download_dir, "fines.rar"),
    mode = 'wb',
    method = 'curl',
    extra = '-k'
  )
  utils::unzip(
    zipfile = file.path(download_dir, "fines.rar"),
    exdir = file.path(download_dir)
  )
}
This also produces a download progress bar; you can silence it by setting extra to '-k -s'.
This now opens you up to a man-in-the-middle attack. (You may already be under such an attack; there is no way to check without verifying the current certificate with someone you know on the other side.)
So you could implement an extra check, e.g. compute the sha256 checksum of the downloaded file and verify that it matches what you expect before proceeding:
library(openssl)  # provides sha256()
myfile <- file.path(download_dir, "fines.rar")  # the downloaded archive
hash <- sha256(file(myfile))
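For example, you could abort before unzipping when the checksum differs; the expected value below is a placeholder, not the real hash of the IBAMA file:
# placeholder checksum -- substitute the value computed from a download you trust
expected <- "0000000000000000000000000000000000000000000000000000000000000000"
hex <- paste(as.character(hash), collapse = "")  # raw bytes -> lowercase hex string
stopifnot(hex == expected)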

R downloading file from S3

I need to download a file from an S3 bucket hosted by my company. So far I am able to retrieve data using the paws package:
library("aws.s3")
library("paws")
paws::s3(config = list(endpoint = "myendpoint"))
mycsv_raw <- s3$get_object(Bucket = "mybucket", key="myfile.csv")
mycsv <- rawToChar(mycsv_raw$Body)
write.csv(mycsv)
However, this is not good because I need to manually convert the raw file to a csv, and that might be more difficult for other file types. Is there not a way to just download the file locally, directly as a csv?
When I try using aws.s3 I get Error in curl::curl_fetch_memory(url, handle = handle) : could not resolve host xxxx. Do you have any idea how to make that work? I am of course in a locked-down corporate environment, but I am using the same endpoint in both cases, so why does it work with one and not the other?
Sys.setenv(
  AWS_S3_ENDPOINT = "https://xxxx"
)
test <- get_object(object = "myfile.csv", bucket = "mybucket",
                   file = "mydownloadedfile.csv")
Error in curl::curl_fetch_memory(url, handle = handle) : could not resolve host xxxx
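A sketch of one way to skip the manual raw-to-csv conversion with paws (assuming, as in the snippet above, that get_object() returns the body as a raw vector): write the bytes straight to disk, which works for any file type:
s3 <- paws::s3(config = list(endpoint = "myendpoint"))
obj <- s3$get_object(Bucket = "mybucket", Key = "myfile.csv")
writeBin(obj$Body, "myfile.csv")  # raw bytes written unchanged; no conversion step
As for the aws.s3 error, one thing worth checking (an assumption on my part, not verified against your environment): aws.s3 prepends the scheme itself when building the request URL, so AWS_S3_ENDPOINT usually wants a bare hostname, and a value that already contains https:// can produce exactly this could-not-resolve-host failure. aws.s3::save_object() is the helper that writes to a file:
Sys.setenv(AWS_S3_ENDPOINT = "xxxx")  # assumption: bare host, no "https://" prefix
aws.s3::save_object(object = "myfile.csv", bucket = "mybucket",
                    file = "mydownloadedfile.csv")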

Knitting: Error: pandoc document conversion failed with error 61

Problem
Our end user fails to produce HTML files and gets this error:
Error: pandoc document conversion failed with error 61
Execution halted
Troubleshooting performed
We set up the proxy in response to a previous error message. That error was:
pandoc.exe: Could not fetch \\HHBRUNA01.hq.corp.eurocontrol.int\alazarov$\R\win-library\3.5\rmarkdown\rmd\h\jquery\jquery.min.js
ResponseTimeout
Error: pandoc document conversion failed with error 67
Execution halted
For this we added "self_contained: no" to Rprofile.site.
We also tried "self_contained: yes".
Current Error Message
Could not fetch http://?/UNC/server.contoso.int/username$/R/win-library/3.5/rmarkdown/rmd/h/default.html
HttpExceptionRequest Request {
host = ""
port = 80
secure = False
requestHeaders = []
path = "/"
queryString = "?/UNC/server.contoso.int/username$/R/win-library/3.5/rmarkdown/rmd/h/default.html"
method = "GET"
proxy = Just (Proxy {proxyHost = "pac.contoso.int", proxyPort = 9512})
rawBody = False
redirectCount = 10
responseTimeout = ResponseTimeoutDefault
requestVersion = HTTP/1.1
}
(InvalidDestinationHost "")
Error: pandoc document conversion failed with error 61
Execution halted
I had the same issue on Windows 10, with the user library path located on a network drive.
Could not fetch http://?/UNC/...
Error: pandoc document conversion failed with error 61
The solution was to run R as administrator, remove the package 'rmarkdown', and reinstall it.
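In code, those steps are just the following (from an R session started as administrator):
# run R as administrator, then:
remove.packages("rmarkdown")
install.packages("rmarkdown")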
In addition to the answer by Malte: when you do not have administrator rights, you can simply change the library directory to one where you have full rights, on C: for example. The default option is your network folder "?/UNC/server.contoso.int/username$/R/win-library/3.5/rmarkdown/rmd/h/default.html", where you do not have sufficient rights, so R cannot knit the markdown file.
In RStudio, click Tools > Install Packages... Under "Install to Library" you can see the default option (in your case it should be "?/UNC/server.contoso.int/username$/R/win-library/3.5/rmarkdown/rmd/h/default.html"). The second option here should be "C:/Program Files/R/R-3.6.2/library".
To change this order, i.e. to make "C:/Program Files/R/R-3.6.2/library" the default folder, execute the following code in a new R file:
bothPaths <- .libPaths()                    # extract both paths
bothPaths <- c(bothPaths[2], bothPaths[1])  # swap the order
.libPaths(bothPaths)                        # apply the new order
After that, you might have to install the rmarkdown package again. This time it will be installed directly into the "C:/Program Files/R/R-3.6.2/library" folder.
Now knitting should work, because R will use the package straight from a folder where you have full rights.
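Note that .libPaths() only changes the search order for the current session. To make the change persist (a sketch, assuming C:/Rlibs is a folder you have full rights to), set R_LIBS_USER in your .Renviron:
# in ~/.Renviron, read at the next R start
R_LIBS_USER=C:/Rlibs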
And the issue was resolved: someone had changed a rule on the server hosting the files without documenting or logging it.

R unable to download file from server

I am trying to download a pdf file that is stored on an internal server.
The URL for this file looks like this:
file_location <- "file://dory.lisa.org/research/data/test.pdf"
I tried downloading this file using download.file:
download.file(file_location, "test.pdf", method = 'curl')
and I am getting an error:
curl: (37) Couldn't open file /research/data/test.pdf
Warning message:
In download.file(file_location, "test.pdf", method = "curl") :
download had nonzero exit status
I tried
url <- ('http://cran.r-project.org/doc/manuals/R-intro.pdf')
download.file(url, 'introductionToR.pdf')
And I have no problem downloading this file, but somehow it shows an error when I try the same approach to download a file on my server.
I'm guessing that the file does not exist at that location on your local drive. When I executed the couple of lines that download from CRAN, I got a pdf file in my user directory. I then had success with this code:
url <- ('file://~/introductionToR.pdf')
download.file(url, 'NewintroductionToR.pdf')
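The curl error in the question points the same way: curl ignores the host part of a file:// URL and looked for /research/data/test.pdf on the local machine. If the share is mounted locally, pointing the URL at the mount path should work; /mnt/research below is a hypothetical mount point:
file_location <- "file:///mnt/research/data/test.pdf"  # three slashes: empty host means "this machine"
download.file(file_location, "test.pdf", method = "curl")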

R XBRL IO Error when attempting to read from SEC web site and local file

I am having trouble with the XBRL library examples for reading XBRL documents, both from the SEC website and from my local hard drive.
This code first attempts the read from the SEC site, as written in the example in the PDF documentation for the XBRL library, and then tries to read a locally saved file:
# Following example from XBRL pdf doc - read xml file directly from sec web site
library(XBRL)
inst <- "http://www.sec.gov/Archives/edgar/data/1223389/000122338914000023/conn-20141031.xml"
options(stringsAsFactors = FALSE)
xbrl.vars <- xbrlDoAll(inst)
# attempt 2 - save the xml file to a local directory - so no web I/O
localdoc <- "~/R/StockTickers/XBRLdocs/aapl-20160326.xml"
xbrl.vars <- xbrlDoAll(localdoc)
Both of these throw an IO error. The first attempt to read from the SEC site results in this and crashes my RStudio instance:
error : Unknown IO error
I/O warning : failed to load external entity "http://www.sec.gov/Archives/edgar/data/1223389/000122338914000023/conn-20141031.xml"
So I restart RStudio, reload the XBRL library, and try the second attempt; reading from the local file gives this error:
I/O warning : failed to load external entity "~/R/StockTickers/XBRLdocs/aapl-20160326.xml"
I am using R version 3.3.0 (2016-05-03)
I hope I am missing something obvious to somebody; I am just not seeing it. Any help would be appreciated.
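Two things that may be worth trying (a sketch, not verified against this XBRL version): expand the ~ yourself, since the underlying XML parser may not do tilde expansion, and download the remote document first so xbrlDoAll() only ever sees a local path:
library(XBRL)
# local file: hand the parser an absolute path instead of "~"
localdoc <- path.expand("~/R/StockTickers/XBRLdocs/aapl-20160326.xml")
xbrl.vars <- xbrlDoAll(localdoc)
# remote file: fetch it with download.file(), then parse the local copy
inst <- "http://www.sec.gov/Archives/edgar/data/1223389/000122338914000023/conn-20141031.xml"
tmp <- file.path(tempdir(), basename(inst))
download.file(inst, tmp, mode = "wb")
xbrl.vars <- xbrlDoAll(tmp)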
