I am trying to programmatically download files like this one from an FTP server.
The home page openly provides the username ("fire") and password ("burnt"), and I can download the files from a browser without any problem.
When I try to do the same in R using httr::GET():
library("httr")
GET(url = "ftp://fuoco.geog.umd.edu/gfed4/monthly/GFED4.0_MQ_200301_BA.hdf",
authenticate(user = "fire", password = "burnt"),
write_disk(file.path(tempdir(), "GFED4.0_MQ_200301_BA.hdf"),
overwrite = TRUE))
I get the following error:
Error in curl::curl_fetch_disk(url, x$path, handle = handle) :
Timeout was reached: Connection time-out
I would greatly appreciate any idea to fix this problem, many thanks!
The problem seems to be that FTP isn't supported by library(httr):
Please see this, or this more recent one.
I'd give library(RCurl) a go instead:
library(RCurl)
url <- "ftp://fuoco.geog.umd.edu/gfed4/monthly/GFED4.0_MQ_200301_BA.hdf"
content <- getBinaryURL(url, userpwd = "fire:burnt", ftp.use.epsv = FALSE)
writeBin(content, con = basename(url))
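If RCurl is not available, the curl package (the one that appears in the error message above) accepts the same credentials through a handle. This is a minimal sketch, assuming the curl package is installed; download_ftp and its arguments are hypothetical names, and the call has not been tried against this server:

```r
# Hypothetical helper: download a binary file over FTP with the curl package.
download_ftp <- function(url, userpwd, dest_dir = tempdir()) {
  dest <- file.path(dest_dir, basename(url))
  h <- curl::new_handle()
  # userpwd and ftp_use_epsv mirror the RCurl options used above
  curl::handle_setopt(h, userpwd = userpwd, ftp_use_epsv = FALSE)
  # mode = "wb" writes the bytes unchanged, which matters for .hdf files
  curl::curl_download(url, destfile = dest, handle = h, mode = "wb")
  dest
}

url <- "ftp://fuoco.geog.umd.edu/gfed4/monthly/GFED4.0_MQ_200301_BA.hdf"
# path <- download_ftp(url, userpwd = "fire:burnt")  # needs network access
```

The helper returns the path of the downloaded file, so the result can be passed straight to whatever reads the HDF file.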
I am trying to send the following API GET Request with the HTTR library in R:
GET('https://api.vitaldb.net/afd182c102c5af625d3f217280b3766d453d9e3f')
But I get the following error message
Error in curl::curl_fetch_memory(url, handle = handle) : Failed writing received data to disk/application
I have tested the specific endpoint in Postman, where I am able to retrieve the correct data, but somehow the R command doesn't work.
Can anyone help me?
According to the API docs, the file is a GZip-compressed CSV, so you could do the following (though someone might know a one-liner that downloads and reads it):
library(data.table)
download.file(url = "https://api.vitaldb.net/afd182c102c5af625d3f217280b3766d453d9e3f.csv.gz",
destfile = file.path("data.csv.gz"))
d <- fread("data.csv.gz", sep = ',')
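On the decompression step: base R connections handle gzip transparently, so the downloaded file can also be read without data.table. The sketch below works offline on a made-up two-row table written to a temporary file; the column names are illustrative and not taken from the API:

```r
# Write a small gzip-compressed CSV to a temporary file (stand-in for data.csv.gz).
tmp <- tempfile(fileext = ".csv.gz")
con <- gzfile(tmp, "w")
write.csv(data.frame(time = c(0, 1), value = c(10.5, 11.2)), con, row.names = FALSE)
close(con)

# read.csv() on a gzfile() connection decompresses while reading.
d <- read.csv(gzfile(tmp))
print(d)
```

The same read.csv(gzfile(...)) call should work on the data.csv.gz file produced by download.file() above.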
I receive the following error...
Error in b_wincred_i_get(target) :
Windows credential store error in 'get': Element not found.
...when running the following script in R v3.5.1 (RStudio v1.1.456) with keyring v1.1.0. This is a new setup, so I'm not sure whether it could be something as simple as a firewall issue. The script errors out when attempting to get the password (key_get(json_data$service, json_data$user)). I've tried manually plugging the service and username into key_get (instead of using the variables from the config file) but get the same error. The config file is a JSON file that holds all of the connection details except the password, which is retrieved from the Windows Credentials Vault. Any help in figuring out a fix is greatly appreciated.
library(RJDBC)
library(keyring)
library(jsonlite)
postgres.connection <- function(json_data){
print("Creating Postgres driver...")
pDriver <- JDBC(driverClass=json_data$driver, classPath="C:/Users/Drivers/postgresql-42.2.4.jar")
print("Connecting to Postgres...")
server <- paste("jdbc:postgresql://", json_data$host, ":", json_data$port, "/", json_data$dbname, sep="")
pConn <- dbConnect(pDriver, server, json_data$user, key_get(json_data$service, json_data$user))
return(pConn)
}
json_data <- fromJSON("C:/Users/Configs/Config.json", simplifyVector = TRUE, simplifyDataFrame = TRUE)
json_data_connection <- json_data$postgres$local_read
pc <- postgres.connection(json_data_connection)
You mentioned it's a new setup. Is the password already in credentials? To check, go to 'Control Panel'>'User Accounts'>'Credential Manager'>'Windows Credentials', is it under 'Generic Credentials'?
I can get the same error by running:
key_get('not-valid-service', 'some-user')
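Since the store reports a missing entry only as "Element not found", it can help to wrap the lookup so the failing service and user are named. A minimal sketch: safe_key_get is a hypothetical wrapper, and the getter argument exists only so the example can run without a real credential store (by default it is keyring::key_get):

```r
# Hypothetical wrapper: report which service/user pair was missing.
safe_key_get <- function(service, username, getter = keyring::key_get) {
  tryCatch(
    getter(service, username),
    error = function(e) {
      stop(sprintf("No stored credential for service '%s' and user '%s' (%s)",
                   service, username, conditionMessage(e)), call. = FALSE)
    }
  )
}

# Stand-in for a credential store that has no matching entry.
fake_getter <- function(service, username) stop("Element not found.")
msg <- tryCatch(safe_key_get("my-db", "some-user", getter = fake_getter),
                error = function(e) conditionMessage(e))
print(msg)
```

If the entry really is missing on the new machine, keyring::key_set(service, username) prompts for the password and stores it, after which key_get should succeed.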
I'm trying to implement R in the workplace and save a bit of time from all the data churning we do.
A lot of files we receive are sent to us via SFTP as they contain sensitive information.
I've looked around on StackOverflow and Google, but nothing seems to work for me. I tried using the RCurl library, following an example I found online, but it doesn't allow me to include the port (22) as part of the login details.
library(RCurl)
protocol <- "sftp"
server <- "hostname"
userpwd <- "user:password"
tsfrFilename <- "Reports/Excelfile.xlsx"
ouptFilename <- "~/Test.xlsx"
url <- paste0(protocol, "://", server, tsfrFilename)
data <- getURL(url = url, userpwd=userpwd)
I end up getting the error code
Error in curlPerform(curl = curl, .opts = opts, .encoding = .encoding) :
embedded nul in string:
Any help would be greatly appreciated as this will save us loads of time!
Thanks,
Shan
This looks similar to the situation here: Using R to download SAS file from ftp-server
I'm no expert in R, but in the example given there, getBinaryURL() worked where getURL() did not.
Hope that helps
M
Note that there are two packages, RCurl and rcurl. For RCurl, I successfully used key files to connect via SFTP:
opts <- list(
ssh.public.keyfile = pubkey, # file name
ssh.private.keyfile = privatekey, # file name
keypasswd = keypasswd # optional password
)
RCurl::getURL(url=uri, .opts = opts, curl = RCurl::getCurlHandle())
For this to work, you need to create the key files, e.g. via PuTTY or similar.
I too was having problems specifying the port options when using the getURI() and getURL() functions.
In order to specify the port, you simply add the port as port = #### instead of port(####). For example:
data <- getURI(url = url,
userpwd = userpwd,
port = 22)
Now, as @MarkThomas pointed out, whenever you get an encoding error, try getBinaryURL() instead of getURI(). In most cases, this will allow you to download SAS files as well as .csv files encoded in UTF-8 or LATIN1.
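The pieces from these answers can be combined: put the port in the URL, make sure there is a "/" between host and path (the paste0() call in the question omits it), and fetch the file as raw bytes. A minimal sketch, assuming RCurl is installed and was built with SFTP support; build_sftp_url and fetch_sftp are hypothetical helper names, and the host, path and credentials are the placeholders from the question:

```r
# Hypothetical helper: build an SFTP URL with an explicit port.
build_sftp_url <- function(host, path, port = 22) {
  # Strip leading slashes from the path so exactly one "/" follows the port.
  sprintf("sftp://%s:%d/%s", host, as.integer(port), sub("^/+", "", path))
}

url <- build_sftp_url("hostname", "Reports/Excelfile.xlsx")
print(url)

# Hypothetical helper: download as raw bytes and write them unchanged.
fetch_sftp <- function(url, userpwd, dest) {
  bytes <- RCurl::getBinaryURL(url, userpwd = userpwd)
  writeBin(bytes, dest)
  invisible(dest)
}
# fetch_sftp(url, "user:password", "Test.xlsx")  # needs a reachable SFTP server
```

Writing the raw vector with writeBin() avoids any text conversion, which is what an .xlsx file needs.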
I've been working on this issue for a few days now and even after contacting the site administrators, I've had no luck in solving it.
I would like to automate the download of a specific file from an ftp server without using any software besides R.
userpwd = "MyUserName:MyPassword"
url <- "ftp://arthurhou.pps.eosdis.nasa.gov/gpmdata/2014/04/01/imerg/3B-HHR.MS.MRG.3IMERG.20140401-S150000-E152959.0900.V03D.HDF5"
dat <- try(getURL(url, userpwd = userpwd,verbose=TRUE,ftp.use.epsv = FALSE))
When I run this, I get the error:
Error in curlPerform(curl = curl, .opts = opts, .encoding = .encoding) :
embedded nul in string: '‰HDF\r\n\032\n\0\0\0\0\0\b\b\0\004\0\020\0\0\0\0\0\0\0\0\0\0\0\0\0ÿÿÿÿÿÿÿÿÚá'\0\0\0\0\0ÿÿÿÿÿÿÿÿ\0\0\0\0\0\0\0\0`\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0OHDR\002,fÉ¿TbÉ¿TfÉ¿TbÉ¿Tà\002"\0\0\0\0\0\003\001\0\0\0\0\0\0\0ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ\n\002\0\001\0\0\0\0\006\027\0\0\0\0\001\004\0\0\0\0\0\0\0\0\004Grid[\001\0\0\0\0\0\0\025\034\0\004\0\0\0\003\002\0ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ\020\020\0\0\0\0\036`&\0\0\0\0\0{\003\0\0\0\0\0\0\0U\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\024î*aOHDR\002,fÉ¿TbÉ¿TfÉ¿TbÉ¿Tà\002"\0\0\0\0\0\003\v\0\0\0\0\0\0\0Ã\025\0\0\0\0\0\0U\026\0\0\0\0\0\0{\026\0\0\0\0\0\0\n\002\0\001\0\0\0\0\025\034\0\004\0\0\0\003\001\0ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ\020\020\0\0\0\0™c&\0\0\0\0\0\034\001\0\0\0\0\0\0\0r\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\
I've tried removing the nulls from the initial link, i.e. url <- "ftp://arthurhou.pps.eosdis.nasa.gov%2Fgpmdata%2F2014%2F04%2F01%2Fimerg%2F3B-HHR.MS.MRG.3IMERG.20140401-S213000-E215959.1290.V03D.HDF5" yet this returns the same error as before.
If anyone would like to try this for themselves, you can register an email at: http://pmm.nasa.gov/data-access/downloads/gpm, and then use the email as the username and password.
This worked for me:
library(httr)
url <- "ftp://arthurhou.pps.eosdis.nasa.gov/gpmdata/2014/04/01/imerg/3B-HHR.MS.MRG.3IMERG.20140401-S150000-E152959.0900.V03D.HDF5"
output_file <- "3B-HHR.MS.MRG.3IMERG.20140401-S150000-E152959.0900.V03D.HDF5"
my_email <- "someone@example.com"
GET(url, authenticate(my_email, my_email),
write_disk(output_file))
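As for why the original attempt failed: getURL() returns the response as a character string, and an R string cannot contain a zero byte, which binary formats such as HDF5 are full of. The nulls are in the file contents, not in the URL. A small offline illustration, using a handful of made-up bytes rather than a real HDF5 file:

```r
# "HDF" followed by zero bytes: illustrative stand-in for binary file contents.
bytes <- as.raw(c(0x48, 0x44, 0x46, 0x00, 0x00, 0x08))

# Converting to a character string fails on the first zero byte.
as_text <- tryCatch(rawToChar(bytes), error = function(e) conditionMessage(e))
print(as_text)

# Keeping the data as a raw vector and writing it with writeBin() is lossless.
tmp <- tempfile(fileext = ".bin")
writeBin(bytes, tmp)
round_trip <- readBin(tmp, what = "raw", n = length(bytes))
print(identical(round_trip, bytes))
```

This is why approaches that write the response straight to disk, or that return a raw vector, avoid the error.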
I was using one of my favorite R packages today to read data from a Google spreadsheet, and it would not work. The problem occurs on all my machines (I use Windows) and appears to be new. I am using version 0.4-1 of RGoogleDocs.
library(RGoogleDocs)
ps <-readline(prompt="get the password in ")
sheets.con = getGoogleDocsConnection(getGoogleAuth("fxxxh#gmail.com", ps, service ="wise"))
ts2=getWorksheets("OnCall",sheets.con)
And this is what I get after running the last line.
Error in curlPerform(curl = curl, .opts = opts, .encoding = .encoding) :
SSL certificate problem, verify that the CA cert is OK. Details:
error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
I did some reading and came across some interesting, but not useful to me at least, information.
When I try to interact with a URL via https, I get an error of the form
Curl: SSL certificate problem, verify that the CA cert is OK
I understood the big picture but did not know how to implement the solution in my script. I added the following line before getWorksheets:
x = getURLContent("https://www.google.com", ssl.verifypeer = FALSE)
That did not work so I tried
ts2=getWorksheets("OnCall",sheets.con,ssl.verifypeer = FALSE)
That also did not work.
Interestingly enough, the following line works
getDocs(sheets.con,folders = FALSE)
What do you suggest I try to get it working again? Thanks.
I no longer have this problem. I do not quite remember the timeline of exactly when I overcame the problem and cannot remember who helped me get here but here is a typical session which works.
library(RGoogleDocs)
if(exists("ps")) print("got password, keep going") else ps <-readline(prompt="get the password in ") #conditional password asking
#WARNING: ssl.verifypeer = FALSE prevents curl from detecting a 'man in the middle' attack
options(RCurlOptions = list(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl"), ssl.verifypeer = FALSE))
sheets.con = getGoogleDocsConnection(getGoogleAuth("fjh#gmail.com", ps, service ="wise"))
ts2=getWorksheets("name of workbook here",sheets.con)
names(ts2)
sheet.1 <-sheetAsMatrix(ts2$"Sheet 1",header=TRUE, as.data.frame=TRUE, trim=TRUE) #Get one sheet
other <-sheetAsMatrix(ts2$"whatever name of tab",header=TRUE, as.data.frame=TRUE, trim=TRUE) #Get other sheet
Does it help you?
Maybe you don't have the certificate bundle installed. I installed it on OS X. You can also find it on the curl site.
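If a certificate bundle is available, pointing RCurl at it lets verification stay switched on instead of being disabled. A minimal sketch: ssl_opts is a hypothetical helper, cainfo is the curl option that takes a bundle file, and the default path assumes RCurl ships cacert.pem in its CurlSSL directory. The demo below uses an empty temporary file purely as a stand-in so it runs without RCurl:

```r
# Hypothetical helper: build RCurl options that keep peer verification on.
ssl_opts <- function(bundle = system.file("CurlSSL", "cacert.pem", package = "RCurl")) {
  # system.file() returns "" when the file or package is missing.
  if (!nzchar(bundle) || !file.exists(bundle)) {
    stop("CA bundle not found: '", bundle, "'", call. = FALSE)
  }
  list(cainfo = bundle, ssl.verifypeer = TRUE)
}

# Stand-in bundle file for illustration only; a real .pem bundle is needed in practice.
demo_bundle <- tempfile(fileext = ".pem")
file.create(demo_bundle)
opts <- ssl_opts(demo_bundle)
str(opts)

# options(RCurlOptions = ssl_opts())  # real use, with RCurl installed
```

Failing early when the bundle is missing makes it clear whether the certificate error comes from the bundle or from the server.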