Importing Excel file using url using read.xls - r

I'm trying to use read.xls from gdata to import an Excel file directly into R. I'm on a Windows machine running 64 bit R.
I have checked my PATH variable for perl and I appear to have that set correctly, so that doesn't appear to be a problem. Here's my code, and I've attached my error below. Does anyone have any pointers on how I can get this done?
require(RCurl)
require(gdata)
url <- "https://dl.dropboxusercontent.com/u/27644144/NADAC%2020140101.xls"
test <- read.xls(url)
The error I'm getting is:
Error in xls2sep(xls, sheet, verbose = verbose, ..., method = method, :
Intermediate file 'C:\Users\Me\AppData\Local\Temp\RtmpeoJNxP\file338c26156d7.csv' missing!
In addition: Warning message:
running command '"C:\STRAWB~1\perl\bin\perl.exe" "C:/Users/Me/Documents/R/win-library/3.0/gdata/perl/xls2csv.pl" "https://dl.dropboxusercontent.com/u/27644144/NADAC%2020140101.xls" "C:\Users\Me\AppData\Local\Temp\RtmpeoJNxP\file338c26156d7.csv" "1"' had status 22
Error in file.exists(tfn) : invalid 'file' argument

@G.G is correct that read.xls does not support https. However, if you simply replace the https with http in the URL you should be able to download the file.
Give this a try:
require(RCurl)
require(gdata)
url <- "http://dl.dropboxusercontent.com/u/27644144/NADAC%2020140101.xls"
test <- read.xls(url)

read.xls supports http and ftp but does not support https. Download it first and then use read.xls with the downloaded file.
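The download-first approach can be sketched as follows. This is a minimal sketch, assuming your build of R has a download.file that can fetch https URLs (otherwise method = "curl" or method = "wget" may be needed). read_remote is a hypothetical helper name, not part of gdata.

```r
# Hypothetical helper: download a remote file to a temporary path,
# then hand the local copy to a reader function (read.xls by default).
read_remote <- function(url, reader = gdata::read.xls, ext = ".xls", ...) {
  tmp <- tempfile(fileext = ext)
  on.exit(unlink(tmp), add = TRUE)  # remove the temporary copy afterwards
  # mode = "wb" keeps binary files such as .xls intact on Windows
  download.file(url, destfile = tmp, mode = "wb", quiet = TRUE)
  reader(tmp, ...)
}

# Usage (needs network access, gdata and a working perl):
# test <- read_remote("https://dl.dropboxusercontent.com/u/27644144/NADAC%2020140101.xls")
```

Because read.xls only ever sees a local file here, its lack of https support no longer matters.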

Related

Download NASA satellite data using RCurl in R

I am trying to download a NetCDF file using RCurl. Can anyone provide any advice on why this is not working?
require(RCurl)
require(ncdf4)
url <- "https://oceandata.sci.gsfc.nasa.gov/MODIS-Aqua/Mapped/Seasonal_Climatology/4km/sst/"
filename <-"A20021722014263.L3m_SCSU_NSST_sst_4km.nc"
download.file(paste0(url, filename),destfile = paste0("~/Desktop/", filename), method="curl")
setwd("~/Desktop/")
files<-dir(pattern="*.nc")
f<-nc_open(files[1])
Error in R_nc4_open: NetCDF: Unknown file format
Error in nc_open(files[1]) :
Error in nc_open trying to open file A20021722014263.L3m_SCSU_NSST_sst_4km.nc
It appears that the downloaded file is an error page in XML format rather than a NetCDF file. If you open it in a text editor such as Notepad, you'll see it contains text like
Sorry, an error has occurred. Use the back button to return to the previous page or go to the Ocean Color Home Page
Are you sure that the filename you're wanting to download actually exists in that URL?
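One way to check this without opening the file by hand is to look at its first bytes. A rough sketch, assuming the usual signatures at the start of the file ("CDF" for classic NetCDF, the 8-byte HDF5 signature for NetCDF-4); looks_like_netcdf is a hypothetical helper, not part of ncdf4:

```r
# Hypothetical helper: guess whether a file is NetCDF from its first bytes.
looks_like_netcdf <- function(path) {
  sig <- readBin(path, what = "raw", n = 8L)
  # Classic NetCDF files start with the ASCII characters "CDF"
  classic <- length(sig) >= 3L && identical(sig[1:3], charToRaw("CDF"))
  # NetCDF-4 files are HDF5 files and start with the HDF5 signature
  hdf5_sig <- as.raw(c(0x89, 0x48, 0x44, 0x46, 0x0d, 0x0a, 0x1a, 0x0a))
  hdf5 <- length(sig) >= 8L && identical(sig, hdf5_sig)
  classic || hdf5
}

# Usage: check the download before calling nc_open
# if (!looks_like_netcdf(files[1])) stop("Downloaded file is not NetCDF")
```

An error page saved under a .nc name fails this check, which points to the URL rather than to nc_open.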

R ftpUpload error: cannot open the connection

I am trying to upload a data.frame called 'ftp_test' via ftpUpload command
library(RCurl)
ftpUpload("Localfile.html", "ftp://User:Password@FTPServer/Destination.html")
and am getting an error:
Error in file(what, "rb") : cannot open the connection
In addition: Warning message:
In file(what, "rb") :
cannot open file 'ftp_test': No such file or directory
Could anyone tell me what the issue is here? Can I actually upload a data.frame directly from the R global environment?
If I can't use the data.frame is there any workaround?
Many thanks,
Artur
Your problem is that you are trying to send an R object over a file transfer protocol, which only transfers files. Since you are saving the object on the server, you have to say how it should be saved. A workaround is to write it to a file, upload that file, and then delete the local copy afterwards. Saving in another format (R.History, for instance) is fine too, but you need to turn the R object into a file in some way. This example uses an open FTP server (uploads get deleted immediately, but you can check whether the code works):
library(RCurl)
filename <- "test.csv"
# Write the data.frame from the question to a file in the working directory
write.csv(ftp_test, file = filename)
# Upload the file just written; check the working directory with getwd() if it is not found
ftpUpload(filename, paste0("ftp://speedtest.tele2.net/upload/", filename))
# Delete the local copy
file.remove(filename)
Also make sure your server is running. You can try your code with the open ftp server.

Downloading NetCDF files with R: Manually works, download.file produces error

I am trying to download a set of NetCDF files from: ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/nwm/prod/nwm.20180425/medium_range/
When I manually download the files I have no issues connecting, but when I use download.file and attempt to connect I get the following error:
Assertion failed!
Program: C:\Program Files\Rstudio\bin\rsession.exe
File: nc4file.c, Line 2771
Expression: 0
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
I have attempted to run the code in R without R studio and got the same result.
My abbreviated code is as follows:
library("ncdf4")
library("ncdf4.helpers")
download.file("ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/nwm/prod/nwm.20180425/medium_range/nwm.t00z.medium_range.channel_rt.f006.conus.nc","c:/users/nt/desktop/nwm.t00z.medium_range.channel_rt.f006.conus.nc")
temp = nc_open("c:/users/nt/desktop/nwm.t00z.medium_range.channel_rt.f006.conus.nc")
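Adding mode = 'wb' to the download.file arguments solves the issue for me. On Windows, download.file defaults to text mode (mode = "w"), which can corrupt binary files such as NetCDF. I've had the same problem when downloading PDFs.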
download.file("ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/nwm/prod/nwm.20180425/medium_range/nwm.t00z.medium_range.channel_rt.f006.conus.nc","C:/teste/teste.nc", mode = 'wb')
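Since the question mentions a set of files, the same fix extends to a loop. A minimal sketch, assuming the other forecast files follow the naming pattern of the f006 file above; nwm_urls, download_all and the forecast hours in the usage line are hypothetical and not verified against the server.

```r
base_url <- "ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/nwm/prod/nwm.20180425/medium_range/"

# Hypothetical helper: build file URLs for the given forecast hours,
# assuming the naming pattern seen in the f006 file.
nwm_urls <- function(hours, base = base_url) {
  sprintf("%snwm.t00z.medium_range.channel_rt.f%03d.conus.nc", base, as.integer(hours))
}

# Hypothetical helper: download each URL in binary mode into dest_dir.
download_all <- function(urls, dest_dir = tempdir()) {
  dest <- file.path(dest_dir, basename(urls))
  for (i in seq_along(urls)) {
    download.file(urls[i], destfile = dest[i], mode = "wb")
  }
  invisible(dest)
}

# Usage (needs network access; the hours are an assumption):
# files <- download_all(nwm_urls(c(3, 6, 9)))
```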

Reading an online xlsx file into R

I am trying to download spreadsheets from AQR data library into R directly.
I have this link: http://www.aqr.com/~/media/files/data-sets/value-and-momentum-everywhere-portfolios-monthly.xlsx which prompts a download. However, when trying the following code:
> url1<-"http://www.aqr.com/~/media/files/data-sets/value-and-momentum-everywhere-portfolios-monthly.xlsx"
> download.file(url1,destfile="example.xlsx")
I get this error
trying URL 'http://www.aqr.com/~/media/files/data-sets/value-and-momentum-everywhere-portfolios-monthly.xlsx'
Error in download.file(url1, destfile = "example.xlsx") : cannot open URL 'http://www.aqr.com/~/media/files/data-sets/value-and-momentum-everywhere-portfolios-monthly.xlsx'
https://www.aqr.com/library/data-sets/value-and-momentum-everywhere-portfolios-monthly is the page from which I am trying to download data(under full set data link).
Could you provide some guidance?
It looks like that link redirects to https, which download.file does not support by default. If you have wget or curl installed you can use
download.file("https://www.aqr.com/~/media/files/data-sets/value-and-momentum-everywhere-portfolios-monthly.xlsx",
"example.xlsx",
method = "wget")
or
download.file("https://www.aqr.com/~/media/files/data-sets/value-and-momentum-everywhere-portfolios-monthly.xlsx",
"example.xlsx",
method = "curl")
These and other options are discussed at Download a file from HTTPS using download.file()
I'm not quite sure what is causing the problem for you, but the following worked for me:
library(XLConnect)
##
con <- "http://www.aqr.com/~/media/files/data-sets/value-and-momentum-everywhere-portfolios-monthly.xlsx"
download.file(con,"xlsxFile.xlsx",mode="wb")
##
newWB <- loadWorkbook(
file="xlsxFile.xlsx",
create=F)
##
R> getSheets(newWB)
[1] "VME Portfolios" "Definitions" "Data Sources" "Disclosures"

Error when using getGEO() in package GEOquery

I'm running the following code in R:
library(GEOquery)
mypath <- "C:/Users/Farzin/Desktop/BIOC"
GDS1 <- getGEO('GDS1',destdir=mypath)
But I'm getting the following error:
Using locally cached version of GDS1 found here:
C:/Users/Farzin/Desktop/BIOC/GDS1.soft.gz
Error in read.table(con, sep = "\t", header = FALSE, nrows = nseries) :
invalid 'nlines' argument
Could anyone please tell me how I could get rid of this error?
I have had the same error using GEOquery (version 2.23.5) with R and Bioconductor on Ubuntu (12.04), whatever GDS file I queried. Could it be that the GEOquery package is faulty?
In my experience, getGEO is extremely finicky. I commonly experience issues connecting to the GEO server. If this happens during a download, getGEO leaves a partial file behind. Because that partial file exists, the next attempt uses the cached, partially downloaded file instead of downloading again, and runs into the error you see (which is expected, because it's not the full file).
To solve this, delete the cached SOFT file and retry the download.
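That cleanup can be sketched as follows, assuming the cache file is named "<id>.soft.gz" inside destdir, as in the message quoted in the question. clear_cached_soft is a hypothetical helper, not a GEOquery function.

```r
# Hypothetical helper: delete a cached SOFT file so getGEO downloads it again.
# Returns TRUE if a file was removed, FALSE if there was nothing to remove.
clear_cached_soft <- function(id, destdir) {
  cached <- file.path(destdir, paste0(id, ".soft.gz"))
  if (file.exists(cached)) file.remove(cached) else FALSE
}

# Usage (needs GEOquery and network access):
# library(GEOquery)
# mypath <- "C:/Users/Farzin/Desktop/BIOC"
# clear_cached_soft("GDS1", mypath)
# GDS1 <- getGEO("GDS1", destdir = mypath)
```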
