Download NASA satellite data using RCurl in R

I am trying to download a NetCDF file using RCurl. Can anyone advise why this is not working?
require(RCurl)
require(ncdf4)
url <- "https://oceandata.sci.gsfc.nasa.gov/MODIS-Aqua/Mapped/Seasonal_Climatology/4km/sst/"
filename <-"A20021722014263.L3m_SCSU_NSST_sst_4km.nc"
download.file(paste0(url, filename),destfile = paste0("~/Desktop/", filename), method="curl")
setwd("~/Desktop/")
files <- dir(pattern = "\\.nc$")  # pattern is a regular expression, not a glob
f<-nc_open(files[1])
Error in R_nc4_open: NetCDF: Unknown file format
Error in nc_open(files[1]) :
Error in nc_open trying to open file A20021722014263.L3m_SCSU_NSST_sst_4km.nc

It appears that the downloaded file is an error page in XML format rather than a NetCDF file. If you open it in a text editor such as Notepad, you'll see that it contains text like:
Sorry, an error has occurred. Use the back button to return to the previous page or go to the Ocean Color Home Page
Are you sure that the file you want to download actually exists at that URL?
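One way to confirm this kind of diagnosis without opening the file by hand is to look at its first few bytes. The sketch below is only an illustration: sniff_download is a hypothetical helper name, and it assumes that classic NetCDF files start with the bytes "CDF", that NetCDF-4 files carry the HDF5 signature, and that an error page starts with "<" after optional whitespace.

```r
# Hypothetical helper (not part of RCurl or ncdf4): guess what a downloaded
# file really contains by inspecting its first bytes.
sniff_download <- function(path) {
  bytes <- readBin(path, what = "raw", n = 64)
  # Classic NetCDF files begin with the ASCII bytes "CDF".
  if (length(bytes) >= 3 && identical(bytes[1:3], charToRaw("CDF"))) {
    return("netcdf-classic")
  }
  # NetCDF-4 files are HDF5 containers: 0x89 followed by "HDF".
  if (length(bytes) >= 4 &&
      identical(bytes[1:4], as.raw(c(0x89, 0x48, 0x44, 0x46)))) {
    return("netcdf4-hdf5")
  }
  # Skip leading whitespace, then look for "<" (HTML or XML markup).
  visible <- bytes[!bytes %in% as.raw(c(0x20, 0x09, 0x0a, 0x0d))]
  if (length(visible) >= 1 && visible[1] == as.raw(0x3c)) {
    return("markup")
  }
  "unknown"
}

# Demonstration with a stand-in error page written to a temporary file.
fake_error <- tempfile(fileext = ".nc")
writeLines("<html><body>Sorry, an error has occurred.</body></html>", fake_error)
sniff_download(fake_error)  # "markup"
```

If the result is "markup", nc_open() cannot succeed, and the URL or the server's access requirements are the thing to check.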

Related

Downloading and unzipping GitHub zipped files directly in R

I am trying to download and unzip a folder of files from GitHub into R. I can manually download the file at https://github.com/dylangomes/SO/blob/main/Shape.zip and then extract all files in working directory, but I'd like to work directly from R.
utils::unzip("https://github.com/dylangomes/SO/blob/main/Shape.zip")
# Warning message:
# In utils::unzip("https://github.com/dylangomes/SO/blob/main/Shape.zip", :
# error 1 in extracting from zip file
It says it is a warning message, although nothing has been downloaded or unzipped into my wd.
I can download the file to my machine:
utils::download.file("https://github.com/dylangomes/SO/blob/main/Shape.zip")
But I get the same message with the unzip function:
utils::unzip("Shape.zip")
And the downloaded file cannot manually be extracted. Here, I get the error that the compressed folder is empty. The unzip line works on the manually downloaded .zip file, which tells me something is wrong with the download.file line.
So if I add raw=TRUE to the end (which can make a difference in downloading data from GitHub):
utils::download.file("https://github.com/dylangomes/SO/blob/main/Shape.zip?raw=TRUE","Shape.zip")
utils::unzip("Shape.zip")
I get a different warning with, similarly, nothing being executed:
Warning message:
In utils::unzip("Shape.zip") : internal error in 'unz' code
I have tried most of the answers at Using R to download zipped data file, extract, and import data, but they appear to be for single files that are zipped and aren't helping here. I've tried the answers at r function unzip error 1 in extracting from zip file, which mentions the same warning message I am getting, but none of the solutions work in this case.
Any idea of what I am doing wrong?
You need to use:
download.file(
"https://github.com/dylangomes/SO/blob/main/Shape.zip?raw=TRUE",
"Shape.zip",
mode = "wb"
)
Without the query string ?raw=TRUE you are downloading the webpage and not the file.
(For Windows) R will use mode = "wb" by default when it detects from the end of the URL that certain file formats, including .zip, are being downloaded. However, the URL finishing with a query string instead of a file format means the check fails so you need to set the mode explicitly.
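An alternative to the query string is to rewrite the "blob" page URL into GitHub's "raw" form, so that the URL still ends in .zip. The helper name github_raw_url below is hypothetical and the sketch only does the string rewrite; the download itself is left commented out because it needs network access.

```r
# Hypothetical helper: turn a GitHub "blob" page URL into a direct-download URL
# by replacing the first "/blob/" path segment with "/raw/".
github_raw_url <- function(blob_url) {
  sub("/blob/", "/raw/", blob_url, fixed = TRUE)
}

zip_url <- github_raw_url("https://github.com/dylangomes/SO/blob/main/Shape.zip")
zip_url  # "https://github.com/dylangomes/SO/raw/main/Shape.zip"

# Not run here (requires network access):
# download.file(zip_url, "Shape.zip", mode = "wb")
# unzip("Shape.zip", exdir = "Shape")
```

Setting mode = "wb" explicitly is still the safer choice, since it does not rely on how the URL happens to end.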

Download .pdf file to R, getting error message

I'm having trouble downloading a .pdf from the internet in RStudio. I would like to analyse the .pdf using the pdftools package. I have a directory called files that I want the .pdf to go to. I'm using this code:
download.file('https://www2.gov.scot/Resource/Doc/352649/0118638.pdf', 'files')
I get this error:
Warning messages:
1: In download.file("https://www2.gov.scot/Resource/Doc/352649/0118638.pdf", :
URL https://www2.gov.scot/Resource/Doc/352649/0118638.pdf: cannot open destfile 'files', reason 'Is a directory'
2: In download.file("https://www2.gov.scot/Resource/Doc/352649/0118638.pdf", :
download had nonzero exit status
Is there a way to get around this message?
The destfile has to be the filename (not the directory name) for the downloaded file.
For example, if we were to download the file above and save it as "Commission.pdf" in the files folder we would do the following:
download.file(url='https://www2.gov.scot/Resource/Doc/352649/0118638.pdf',
destfile="files/Commission.pdf")
You're passing a directory (files) as destfile, which is why R warns that the destination you specified is a directory.
You are misreading the function signature. It is
download.file(url, destfile, ...)
Therefore, when you're using download.file('https://www2.gov.scot/Resource/Doc/352649/0118638.pdf', 'files'), you are downloading the file https://www2.gov.scot/Resource/Doc/352649/0118638.pdf and saving it with the name files.
What you need to do is change the second argument so that it contains the complete file path. It can be something like this:
download.file('https://www2.gov.scot/Resource/Doc/352649/0118638.pdf', 'files/0118638.pdf')
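If the file name should simply follow the URL, the destination path can be built rather than typed out. This is a minimal sketch: dest_in_dir is a hypothetical helper, and the demonstration uses a folder under tempdir() instead of the asker's files directory so that it has no side effects.

```r
# Hypothetical helper: build a destination *file* path inside a directory,
# creating the directory first if it does not exist yet.
dest_in_dir <- function(url, dir) {
  if (!dir.exists(dir)) dir.create(dir, recursive = TRUE)
  file.path(dir, basename(url))
}

pdf_url  <- "https://www2.gov.scot/Resource/Doc/352649/0118638.pdf"
destfile <- dest_in_dir(pdf_url, file.path(tempdir(), "files"))
basename(destfile)  # "0118638.pdf"

# Not run here (requires network access):
# download.file(pdf_url, destfile, mode = "wb")
```

basename() only gives a sensible name when the URL ends in a file name; for URLs that end in a query string the name has to be chosen by hand.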

r unable to download file from server

I am trying to download a pdf file that is stored in an internal server.
The url for this file is like this below
file_location <- "file://dory.lisa.org/research/data/test.pdf"
I tried downloading this file using the download.file option
download.file(file_location, "test.pdf",method='curl')
and I am getting an error:
curl: (37) Couldn't open file /research/data/test.pdf
Warning message:
In download.file(file_location, "test.pdf", method = "curl") :
download had nonzero exit status
I tried
url <- ('http://cran.r-project.org/doc/manuals/R-intro.pdf')
download.file(url, 'introductionToR.pdf')
And I have no problem downloading this file, but somehow I get an error when I use the same approach to download a file on my server.
I'm guessing that the file does not exist at that location on your local drive. When I executed the couple of lines that downloaded from CRAN I get a pdf file in my User directory/folder. I then get success with this code:
url <- ('file://~/introductionToR.pdf')
download.file(url, 'NewintroductionToR.pdf')
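The curl message above shows it looking for /research/data/test.pdf, which suggests the file:// URL was resolved as a local path rather than fetched from the server. The sketch below illustrates how a file:// URL behaves with a path that does exist locally. It uses a temporary stand-in file, and the "file://" prefix assumes a Unix-alike path (on Windows the prefix would be "file:///").

```r
# Create a small stand-in file so the sketch runs without any server.
src <- tempfile(fileext = ".txt")
writeLines("stand-in for test.pdf", src)

# Assumption: Unix-alike absolute path, so "file://" + "/tmp/..." is valid.
src_url <- paste0("file://", normalizePath(src, winslash = "/"))
dest <- tempfile(fileext = ".txt")
download.file(src_url, dest, mode = "wb", quiet = TRUE)

# For a path that is readable from this machine (for example a mounted
# network share), file.copy() does the same job without a URL.
dest2 <- tempfile(fileext = ".txt")
file.copy(src, dest2)
```

If the server is only reachable over the network, the file would need to be exposed through a mounted share or an http(s) URL before either approach can read it.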

R download.file, downloading excel file does not work

I try to download an excel file using download.file().
If I go directly to the link using the browser, I can download the file without problems.
However, download.file only downloads a broken file, and Excel reports: "The file you are trying to open is in a different format than specified by the file extension."
Here is my code:
url <- "http://obieebr.banrep.gov.co/analytics/saw.dll?Download&Format=excel2007&Extension=.xlsx&BypassCache=true&path=%2Fshared%2fSeries%20Estad%c3%adsticas%2F1.%20Tasa%20Interbancaria%20%28TIB%29%2F1.1.TIB_Serie%20hist%C3%B3rica%20IQY&lang=es&NQUser=publico&NQPassword=publico&SyncOperation=1"
download.file(url, destfile = paste0(base_dir, "test.xls"), mode = "wb", method="libcurl")
Any ideas how to download this file?
Many thanks for your help!
Try this, it works for me:
download.file(url,destfile = "./second.xlsx",mode = "wb")
The file you are trying to download is simply not an Excel file. What you actually obtain is an HTML file (try changing the file extension to '.html', then open it in your browser). So your code is not the problem.
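The same check can be done from R by reading the first few lines of the download as text. The helper name looks_like_html is hypothetical, and the test for "<html" or "<!DOCTYPE html" is a rough heuristic rather than a full content-type check.

```r
# Hypothetical helper: TRUE if the first few lines of a file look like HTML.
looks_like_html <- function(path, n = 5) {
  head_lines <- readLines(path, n = n, warn = FALSE, skipNul = TRUE)
  # useBytes = TRUE avoids errors on binary content with invalid encodings.
  any(grepl("<!DOCTYPE html|<html", head_lines,
            ignore.case = TRUE, useBytes = TRUE))
}

# Demonstration with a stand-in HTML page saved under an .xlsx name.
fake_xlsx <- tempfile(fileext = ".xlsx")
writeLines(c("<!DOCTYPE html>", "<html><body>Login required</body></html>"),
           fake_xlsx)
looks_like_html(fake_xlsx)  # TRUE
```

A TRUE result means the server returned a web page (for example a login or redirect page) instead of the workbook, so changing download.file arguments alone is unlikely to help.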

How to download file from internet via R

I have a URL and I want to download the file via R. I noticed that download.file should help, but my case seems different:
url <- "http://journal.gucas.ac.cn/CN/article/downloadArticleFile.do?attachType=PDF&id=11771"
destfile <- "myfile.pdf"
download.file(url, destfile)
It doesn't work! If my URL ends in a file name such as xxx.pdf, the code above works fine; otherwise the downloaded file is corrupt.
Does anyone know how to solve this problem?
Setting the mode might be required to treat the file as binary data while saving it. If I leave that argument out, I get a blank file, but this way works for me:
url <- "http://journal.gucas.ac.cn/CN/article/downloadArticleFile.do?attachType=PDF&id=11771"
destfile <- "myfile.pdf"
download.file(url, destfile, mode="wb")
I am trying to download an nc file with R. It downloads well but I get this error when trying to open it:
Error in R_nc4_open: NetCDF: Unknown file format
Error in nc_open("SM_D2010323_Map_SATSSS_data_1day.nc") :
Error in nc_open trying to open file SM_D2010323_Map_SATSSS_data_1day.nc (return_on_error= FALSE )
url <- "https://www.star.nesdis.noaa.gov/data/socd1/coastwatch/products/miras/nc/SM_D2010323_Map_SATSSS_data_1day.nc"
destfile <- "***/SM_D2010323_Map_SATSSS_data_1day.nc"
download.file(url, destfile)
nc_data <- nc_open('SM_D2010323_Map_SATSSS_data_1day.nc')
But when I download the file from the same URL in my web browser, I can open it in R without any problems.
