Download multiple Excel files linked through URLs in R

I have a list containing hundreds of URLs directly linking to .xlsx files for a download:
list <- c("https://ec.europa.eu/consumers/consumers_safety/safety_products/rapex/alerts/?event=main.weeklyReport.Excel&web_report_id=980",
"https://ec.europa.eu/consumers/consumers_safety/safety_products/rapex/alerts/?event=main.weeklyReport.Excel&web_report_id=981",
"https://ec.europa.eu/consumers/consumers_safety/safety_products/rapex/alerts/?event=main.weeklyReport.Excel&web_report_id=990")
To download everything in the list, I created a loop:
for (url in list) {
  download.file(url, destfile = "Rapex-Publication.xlsx", mode = "wb")
}
However, it only downloads the first file and not the rest. My guess is that the program is overwriting the same destfile. What would I have to do to circumvent this issue?

Try something along the lines of:
for (i in seq_along(list)) {
  # give each report its own destination file so nothing gets overwritten
  download.file(list[i], destfile = paste0("Rapex-Publication-", i, ".xlsx"), mode = "wb")
}
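If you would rather have file names that reflect the report instead of the loop position, a minimal variation (assuming every URL carries a web_report_id= query parameter, as in the examples above) could reuse that id:
for (url in list) {
  # pull the value of the web_report_id parameter out of the URL (assumed present)
  id <- sub(".*web_report_id=", "", url)
  download.file(url, destfile = paste0("Rapex-Publication-", id, ".xlsx"), mode = "wb")
}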

Related

R: what doesn't curl_download like about a filename

I want to download some files (NetCDF although I don't think that matters) from a website
and write them to a specified data directory on my hard drive. Some code that illustrates my problem follows:
library(curl)
baseURL <- "http://gsweb1vh2.umd.edu/LUH2/LUH2_v2f/"
fileChoice <- "IMAGE_SSP1_RCP19/multiple-states_input4MIPs_landState_ScenarioMIP_UofMD-IMAGE-ssp119-2-1-f_gn_2015-2100.nc"
destDir <- paste0(getwd(), "/data-raw/")
url <- paste0(baseURL, fileChoice)
destfile <- paste0(destDir, "test.nc")
curl_download(url, destfile) # this one works
destfile <- paste0(destDir, fileChoice)
curl_download(url, destfile) #this one fails
The error message is
Error in curl_download(url, destfile) :
Failed to open file /Users/gcn/Documents/workspace/landuse/data-raw/IMAGE_SSP1_RCP19/multiple-states_input4MIPs_landState_ScenarioMIP_UofMD-IMAGE-ssp119-2-1-f_gn_2015-2100.nc.curltmp.
It turns out that curl_download internally adds .curltmp to destfile and then removes it, but I can't figure out why writing that temporary file fails.
It turns out that the problem is that the fileChoice variable includes a new directory, IMAGE_SSP1_RCP19. Once I created the directory, the process worked fine. I'm posting this because someone else might make the same mistake I did.
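A minimal sketch of that fix, assuming the same url, destDir and fileChoice objects as in the question, is to create the missing subdirectory before calling curl_download:
library(curl)
destfile <- paste0(destDir, fileChoice)
# create the IMAGE_SSP1_RCP19 subdirectory (and any parents) if it does not exist yet
dir.create(dirname(destfile), recursive = TRUE, showWarnings = FALSE)
curl_download(url, destfile)  # curl can now open its .curltmp file in that directory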

How to save text files from multiple URLs in a common csv file using a for loop

I'm using R, I've currently got a for loop to save text data from URLs in a csv file:
for (i in 1:9) {
  cancerdbdata <- paste0("http://annotation.dbi.udel.edu/CancerDB/record_CD_0000", i, ".txt")
  cancerdbdata1 <- download.file(cancerdbdata, destfile = "CancerDrugDBdestfile.csv")
}
However, as this loops, it does not append the data from each URL to the csv file, and I am left with a csv file that only contains the information from the last URL. I've tried to find a way to add the data from each URL sequentially but cannot. Sorry if this has already been asked; I looked around but couldn't find anything that made sense to me. Thanks in advance for an answer or for redirecting me to one!
download.file has a mode argument that you can set to "a" (append). This makes sure the additional data is appended to the file instead of overwriting it.
The following should do:
for (i in 1:9) {
  cancerdbdata <- paste0("http://annotation.dbi.udel.edu/CancerDB/record_CD_0000",
                         i,
                         ".txt")
  cancerdbdata1 <- download.file(cancerdbdata,
                                 destfile = "CancerDrugDBdestfile.csv",
                                 mode = "a")
}
I hope this helps.
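If you would rather assemble the file in R (for example, to guarantee each record ends with a newline before the next one is appended), a rough alternative, assuming each URL returns plain text, is:
out <- "CancerDrugDBdestfile.csv"
if (file.exists(out)) file.remove(out)      # start from an empty file
for (i in 1:9) {
  url <- paste0("http://annotation.dbi.udel.edu/CancerDB/record_CD_0000", i, ".txt")
  record <- readLines(url)                  # read the record as lines of text
  write(record, file = out, append = TRUE)  # append it to the combined file
}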

Use R to iteratively download all tiff files from shared Google Drive folder

I would like to use R to:
1) create a list of all tif files in a shared google drive folder
2) loop through list of files
3) save each file to local drive
I've tried RGoogleDocs and RGoogleData, and both seem to have stopped development; neither supports downloading tif files. There is also GoogleSheets, but again, it doesn't suit my needs. Does anyone know of a way to accomplish this task?
-cherrytree
Here's a part of my code (I cannot share all of it) that takes a list of URLs and makes a copy of each file on the hard drive:
if (Download == TRUE) {
  urls <- DataFrame$productimagepath
  for (url in urls) {
    # build a destination path from the last component of the URL
    newName <- paste0("Academy/", basename(url))
    download.file(url, destfile = newName, mode = "wb")
  }
}
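That only works once you already have direct URLs. If the files live in a shared Google Drive folder and you need R to list them first, one possible route (not part of the original answer) is the googledrive package; a rough sketch, assuming the shared folder is visible to your account under the hypothetical name "my-shared-folder", might look like:
library(googledrive)

drive_auth()                                    # interactive Google authentication
folder <- drive_get("my-shared-folder")         # hypothetical folder name
tifs <- drive_ls(folder, pattern = "\\.tif$")   # list the tif files it contains
dir.create("Academy", showWarnings = FALSE)     # local destination folder
for (i in seq_len(nrow(tifs))) {
  # save each file locally under its original name
  drive_download(tifs[i, ], path = file.path("Academy", tifs$name[i]), overwrite = TRUE)
}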

R Downloading multiple file from FTP using Rcurl

I'm a new R user.
I am trying to download 7,000 files (.nc format) from an FTP server, for which I have a user name and password. On the website, each file is a link to download, and I would like to download all of the .nc files.
I would be grateful to anyone who can help me run this job in R. Below is an example of what I have tried using RCurl and a loop; it tells me it cannot download all files.
library(RCurl)
url <- "ftp://ftp.my.link.fr/1234/"
userpwd <- "user:password"
destination <- "/Users/ME/Documents"
filenames <- getURL(url, userpwd = userpwd,
                    ftp.use.epsv = FALSE, dirlistonly = TRUE)
for (i in seq_along(url)) {
  download.file(url[i], destination[i], mode = "wb")
}
How can I do that?
The first thing you'd notice is that the files in your directory, i.e. the object filenames, come back as one long string. To obtain all the file names as a character vector, you can try:
files <- unlist(strsplit(filenames, '\n'))
From here on, it's simply a matter of looping through all the files in the directory. I recommend using the curl package rather than RCurl to download the files, as it makes it easier to supply authentication info with every download request.
library(curl)
h <- new_handle()
handle_setopt(h, userpwd = "user:pwd")
and then
lapply(files, function(filename) {
  curl_download(paste(url, filename, sep = ""), destfile = filename, handle = h)
})
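Putting those pieces together with the url and destination objects from the question, the whole download step might look roughly like this (a sketch, not tested against the actual server):
library(RCurl)
library(curl)

url <- "ftp://ftp.my.link.fr/1234/"
destination <- "/Users/ME/Documents"

# list the FTP directory, then split the listing into individual file names
filenames <- getURL(url, userpwd = "user:password",
                    ftp.use.epsv = FALSE, dirlistonly = TRUE)
files <- unlist(strsplit(filenames, "\n"))

# one handle carries the credentials for every request
h <- new_handle()
handle_setopt(h, userpwd = "user:password")

for (filename in files) {
  curl_download(paste0(url, filename),
                destfile = file.path(destination, filename),
                handle = h)
}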

R download and read many excel files, automatically

I need to download a few hundred Excel files and import them into R each day. Each one should become its own data frame. I have a .csv file with all the addresses (the addresses remain static).
The .csv file looks like this:
http://www.www.somehomepage.com/chartserver/hometolotsoffiles%a
http://www.www.somehomepage.com/chartserver/hometolotsoffiles%b
http://www.www.somehomepage.com/chartserver/hometolotsoffiles%a0
http://www.www.somehomepage.com/chartserver/hometolotsoffiles%aa11
etc.....
I can do it with a single file like this:
library(XLConnect)
my.url <- "http://www.somehomepage.com/chartserver/hometolotsoffiles%a"
loc.download <- "C:/R/lotsofdata/" # each file probably needs its own name here?
download.file(my.url, loc.download, mode = "wb")
df.import.x1 <- readWorksheetFromFile(loc.download, sheet = 2)
# This kind of import works on all the files, if you run them individually
But I have no idea how to download each file, and place it separately in a folder, and then import them all into R as individual data frames.
It's hard to answer your question as you haven't provided a reproducible example and it isn't clear exactly what you want. Anyway, the code below should point you in the right direction.
You have a list of urls you want to visit:
urls = c("http://www/chartserver/hometolotsoffiles%a",
"http://www/chartserver/hometolotsoffiles%b")
In your example, you load this from a csv file.
Next we download each file and put it in a separate directory (as you mentioned in your question):
for (url in urls) {
  split_url <- strsplit(url, "/")[[1]]
  ## Extract the final part of the URL
  dir <- split_url[length(split_url)]
  ## Create a directory for it
  dir.create(dir)
  ## Download the workbook into that directory
  download.file(url, destfile = file.path(dir, paste0(dir, ".xlsx")), mode = "wb")
}
Then we loop over the directories and files and store the results in a list.
##Read in files
l <- list(); i <- 1
## the directories created above sit in the working directory
dirs <- list.dirs(".", recursive = FALSE)
for (dir in dirs) {
  file <- list.files(dir, full.names = TRUE)
  ## Do something?
  ## Perhaps store the sheets as a list
  l[[i]] <- readWorksheetFromFile(file, sheet = 2)
  i <- i + 1
}
We could of course combine steps two and three into a single loop. Or drop the loops and use sapply.
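For instance, a compact version of the read step, in the spirit of that last remark and assuming the same directory layout, could be:
dirs <- list.dirs(".", recursive = FALSE)
## one data frame per directory, collected in a list
l <- lapply(dirs, function(dir) {
  file <- list.files(dir, full.names = TRUE)
  readWorksheetFromFile(file, sheet = 2)
})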
