I have an Excel file containing company names and downloadable links to their .pdf files. My aim is to create a directory for each company name in the Excel column and download the PDF file into the newly created directory.
Here is my code:
##Set the working directory
txtsrc<-"C:\\FirstAid"
setwd(txtsrc)
##make a vector file names and links
pdflist <- read.xlsx("Final results_6thjuly.xlsx",1)
colnames(pdflist)
##Check if docs folder exists
if (dir.exists("FirstAid_docs")=="FALSE"){
dir.create("FirstAid_docs")
}
##Change the working directory
newfolder<-c("FirstAid_docs")
newpath<-file.path(txtsrc,newfolder)
setwd(newpath)
##Check the present working directory
getwd()
## Create directories and download files
for( i in 1:length(pdflist[,c("ci_CompanyName")])){
##First delete the existing directories
if(dir.exists(pdflist[,c("ci_CompanyName")][i])=="TRUE"){
unlink(pdflist[,c("ci_CompanyName")][i], recursive = TRUE)
}
##Create a new directory
directoryname<-pdflist[,c("ci_CompanyName")][i]
dir.create(directoryname,recursive = FALSE, mode = "0777")
##Get the downloadable links
##Link like :www.xyz.com thus need to add https to it
link<-pdflist[,c("DocLink")][i]
vallink<-c("https://")
##Need to remove quotes from link
newlink<-paste0(vallink,link)
newlink<-noquote(newlink)
##Set paths for the downloadble file
destfile<-file.path(txtsrc,newfolder,directoryname)
##Download the file
download.file(newlink,destfile,method="auto")
##Next record
i<-i+1
}
This is the error/result I get:
> colnames(pdflist)
[1] "ci_CompanyID" "ci_CompanyName" "ProgramScore" "ID_DI" "DocLink"
> download.file(newlink,destfile,method="auto")
Error in download.file(newlink, destfile, method = "auto") :
cannot open destfile 'C:\Users\skrishnan\Desktop\HR needed\text analysis proj\pdf\FirstAid/FirstAid_docs/Buckeye Partners, LP', reason 'Permission denied'
Despite setting the directory mode (chmod), why do I get this error?
I am using CRAN RGui (64-bit) with R version 3.5.0 on a 64-bit Windows machine.
Any help is greatly appreciated.
destfile in download.file needs to be a specific file, not just a directory. For example,
'C:\Users\skrishnan\Desktop\HR needed\text analysis proj\pdf\FirstAid\FirstAid_docs\Buckeye Partners, LP\myFile.pdf'
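A minimal sketch of building a file-level destination, assuming the file name can be taken from the last component of the link. The helper name `make_destfile` and the example link are hypothetical and only serve as illustration:

```r
# Hypothetical helper: build a destination *file* path inside a company directory.
make_destfile <- function(base_dir, company, link) {
  # basename() keeps only the last path component of the link, e.g. "report.pdf"
  file_name <- basename(link)
  file.path(base_dir, company, file_name)
}

# Example link is made up for illustration.
destfile <- make_destfile("C:/FirstAid/FirstAid_docs",
                          "Buckeye Partners, LP",
                          "www.xyz.com/docs/report.pdf")
print(destfile)

# The download itself would then look like this (not run here):
# download.file("https://www.xyz.com/docs/report.pdf", destfile, mode = "wb")
# mode = "wb" requests a binary transfer, which matters for PDFs on Windows.
```

If the links do not end in a usable file name, a fixed name such as "my.pdf" inside each company directory works as well.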
The final working code:
##Set the working directory
txtsrc<-"C:\\FirstAid"
setwd(txtsrc)

##make a vector file names and links
pdflist <- read.xlsx("Final results_6thjuly.xlsx",1)
colnames(pdflist)

##Check if docs folder exists
if (dir.exists("FirstAid_docs")=="FALSE"){
dir.create("FirstAid_docs")
}

##Change the working directory
newfolder<-c("FirstAid_docs")
newpath<-file.path(txtsrc,newfolder)
setwd(newpath)

##Check the present working directory
getwd()

## Create directories and download files
for( i in 1:length(pdflist[,c("ci_CompanyName")])){

##First delete the existing directories
if(dir.exists(pdflist[,c("ci_CompanyName")][i])=="TRUE"){
unlink(pdflist[,c("ci_CompanyName")][i], recursive = TRUE)
}

##Create a new directory
directoryname<-pdflist[,c("ci_CompanyName")][i]
dir.create(directoryname,recursive = FALSE, mode = "0777")

##Get the downloadable links
##Link like :www.xyz.com thus need to add https to it
link<-pdflist[,c("DocLink")][i]
vallink<-c("https://")

##Need to remove quotes from link
newlink<-paste0(vallink,link)
newlink<-noquote(newlink)

##Set paths for the downloadble file
neway<-file.path(newpath,directoryname)
destfile<-paste(neway,"my.pdf",sep="/")

##Download the file
download.file(newlink,destfile,method="auto")

##Next record
i<-i+1
}
I have a script that has been running smoothly for months. The last line of code basically goes as follows:
saveWorkbook(Wb, 'address/filename.xlsx', overwrite = TRUE)
I run this script weekly (Mondays, unimportant), so I go to run it this week and I'm now getting this error when I go to save this created workbook:
Warning message:
In file.append(to[okay], from[okay]) : write error during file append
The address for this file is on a shared drive for work, so one of my first thoughts was maybe there were some new permissions for the shared drive, since saving this on local drives seems okay. But, I can save csv files on the shared drive still (using data.table::fwrite).
I'm a bit at a loss here. I've updated R, RTools, and RStudio and all my packages.
Has anyone come across this, or a similar, issue before? I could possibly be looking for some more information concerning the "write error during file append". I'm actually creating a whole new file when I run this and not appending anything to an existing file. But, I haven't been able to find anything explaining situations that could cause this error.
I get the same warning, but only inside a blob container on Azure, where the overwrite fails:
packageVersion("openxlsx")
[1] ‘4.2.5’
> openxlsx::write.xlsx(x = mtcars,file ="mtcars.xlsx")
> openxlsx::write.xlsx(x = mtcars,file ="mtcars.xlsx")
Warning message:
In file.append(to[okay], from[okay]) : write error during file append
> openxlsx::write.xlsx(x = mtcars,file ="mtcars.xlsx",overwrite = T)
Warning message:
In file.append(to[okay], from[okay]) : write error during file append
> openxlsx::write.xlsx(x = mtcars,file ="mtcars.xlsx",overwrite = F)
Error in saveWorkbook(wb, file = file, overwrite = overwrite) :
File already exists!
The same code on my desktop produces different output:
> packageVersion("openxlsx")
[1] ‘4.2.5’
> openxlsx::write.xlsx(x = mtcars,file ="mtcars.xlsx")
> openxlsx::write.xlsx(x = mtcars,file ="mtcars.xlsx")
> openxlsx::write.xlsx(x = mtcars,file ="mtcars.xlsx",overwrite = T)
> openxlsx::write.xlsx(x = mtcars,file ="mtcars.xlsx",overwrite = F)
Error in saveWorkbook(wb, file = file, overwrite = overwrite) :
File already exists!
As we can't fix Azure, I recommend deleting the file first and then writing it:
> filexls= "mtcars.xlsx"
> if (file.exists(filexls)) {
+ file.remove(filexls)
+ }
[1] TRUE
> openxlsx::write.xlsx(x = mtcars,file ="mtcars.xlsx",overwrite = T)
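The delete-then-write step can be wrapped in a small helper. This is a minimal sketch rather than a fix for the underlying Azure behaviour; `write_fresh` is a hypothetical name, and the demonstration uses `write.csv()` so that it runs without openxlsx installed:

```r
# Hypothetical helper: delete the target first, then call the supplied writer.
write_fresh <- function(path, writer) {
  if (file.exists(path)) {
    removed <- file.remove(path)
    if (!removed) stop("could not remove existing file: ", path)
  }
  writer(path)
  invisible(path)
}

# Demonstration with a base R writer and a temporary file.
target <- file.path(tempdir(), "mtcars.csv")
write_fresh(target, function(p) write.csv(mtcars, p))
write_fresh(target, function(p) write.csv(head(mtcars), p))  # replaces the first file
rows_on_disk <- nrow(read.csv(target))
print(rows_on_disk)
```

With openxlsx the call would presumably be `write_fresh("mtcars.xlsx", function(p) openxlsx::write.xlsx(mtcars, file = p))`. Stopping when the removal fails makes a locked or read-only file visible instead of hiding it behind the append warning.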
I am trying to get the directory of the executing script:
> pt=funr::get_script_path()
> pt
NULL
> source("yourfile.R", chdir = T)
This code does not give the executing directory. It gives only:
> source("yourfile.R", chdir = T)
How do I get the directory name?
I am trying to download all the files inside an FTP folder:
temp <- tempfile()
destination <- "D:/test"
url <- "ftp://XX.XX.net/"
userpwd <- "USER:Password"
filenames <- getURL(url, userpwd = userpwd,ftp.use.epsv = FALSE,dirlistonly = TRUE)
filenames <- strsplit(filenames, "\r*\n")[[1]]
When I print filenames, I get all the file names inside the FTP folder, so the output is correct up to here:
[1] "2018-08-28-00.gz" "2018-08-28-01.gz"
[3] "2018-08-28-02.gz" "2018-08-28-03.gz"
[5] "2018-08-28-04.gz" "2018-08-28-05.gz"
[7] "2018-08-28-08.gz" "2018-08-28-09.gz"
[9] "2018-08-28-10.gz" "2018-08-28-11.gz"
[11] "2018-08-28-12.gz" "2018-08-28-13.gz"
[13] "2018-08-28-14.gz" "2018-08-28-15.gz"
[15] "2018-08-28-16.gz" "2018-08-28-17.gz"
[17] "2018-08-28-18.gz" "2018-08-28-23.gz"
for ( i in filenames ) {
download.file(paste0(url,i), paste0(destination,i), mode="w")
}
I got this error
trying URL 'ftp://XXX.net/2018-08-28-00.gz'
Error in download.file(paste0(url, i), paste0(destination, i), mode = "w") :
cannot open URL 'ftp://XXX.net/2018-08-28-00.gz'
In addition: Warning message:
In download.file(paste0(url, i), paste0(destination, i), mode = "w") :
InternetOpenUrl failed: 'The login request was denied'
I modified the code to
for ( i in filenames )
{
#download.file(paste0(url,i), paste0(destination,i), mode="w")
download.file(getURL(paste(url,filenames[i],sep=""), userpwd =
"USER:PASSWORD"), paste0(destination,i), mode="w")
}
After that, I got this error
Error in function (type, msg, asError = TRUE) : RETR response: 550
Without a minimal, complete, and verifiable example it is a challenge to directly replicate your problem. Assuming the file names don't include the URL, you'll need to combine them to access the files.
download.file() requires the URL of the file to read, the path of the output file, and a mode argument that says whether the download is text or binary.
For example, I have data from Alberto Barradas' Pokémon Stats kaggle.com data set stored on my Github site. To download some of the files to the test subdirectory of my R Working Directory, I can use the following code:
filenames <- c("gen01.csv","gen02.csv","gen03.csv")
fileLocation <- "https://raw.githubusercontent.com/lgreski/pokemonData/master/"
# use ./ for subdirectory of current directory, end with / to work with paste0()
destination <- "./test/"
# note that these are character files, so use mode="w"
for (i in filenames){
download.file(paste0(fileLocation,i),
paste0(destination,i),
mode="w")
}
The paste0() function concatenates text without spaces, which allows the code to generate a fully qualified path name for the url of each source file, as well as the subdirectory where the destination file will be stored.
To illustrate what's happening with paste0() in the for() loop, we can use message() to print to the R console.
> # illustrate what paste0() does
> for (i in filenames){
+ message(paste("Source is: ",paste0(fileLocation,i)))
+ message(paste("Destination is:",paste0(destination,i)))
+ }
Source is: https://raw.githubusercontent.com/lgreski/pokemonData/master/gen01.csv
Destination is: ./test/gen01.csv
Source is: https://raw.githubusercontent.com/lgreski/pokemonData/master/gen02.csv
Destination is: ./test/gen02.csv
Source is: https://raw.githubusercontent.com/lgreski/pokemonData/master/gen03.csv
Destination is: ./test/gen03.csv
>
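Applied to the FTP case in the question, one possibility is to put the credentials into the URL and to build the destination with `file.path()`, which supplies the separator that `paste0("D:/test", i)` leaves out. This is only a sketch: the helper names are hypothetical, the host and credentials are the placeholders from the question, and whether the server accepts credentials passed in the URL is an assumption.

```r
# Hypothetical helpers; host, user and password are placeholders from the question.
build_ftp_url <- function(host, userpwd, file_name) {
  # yields ftp://USER:Password@host/file
  paste0("ftp://", userpwd, "@", host, "/", file_name)
}
build_dest <- function(destination, file_name) {
  # file.path() inserts the "/" between folder and file name
  file.path(destination, file_name)
}

filenames <- c("2018-08-28-00.gz", "2018-08-28-01.gz")
for (f in filenames) {
  src <- build_ftp_url("XX.XX.net", "USER:Password", f)
  dst <- build_dest("D:/test", f)
  message("Source is: ", src)
  message("Destination is: ", dst)
  # download.file(src, dst, mode = "wb")  # not run here; "wb" because .gz is binary
}
```

A password containing characters such as `@` or `/` would need to be URL-encoded before it is pasted into the URL.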
I am trying to extract (unzip) an archive, "pakistan.zip", which contains 5 files (Pak_admin0.shp, Pak_admin0.shx, Pak_admin0.dbf, Pak_admin0.prj, Pak_admin0.qpj), and copy the .shp, .shx and .dbf files from it to a destination folder, using RStudio version 0.99.451 and the following code:
for(j in list(".shp", ".shx", ".dbf"))
{
fname <- unzip(file=paste("pakistan", j, sep=""), zipfile= "pakistan.zip")
file.copy(fname, paste("./pakistan", j, sep="/"), overwrite=TRUE)
}
unlink("pakistan.zip")
but it gives me the following error:
Warning messages:
1: In unzip(file = paste("zupanije", j, sep = ""), zipfile = "pakistan.zip") : requested file not found in the zip file
2: In unzip(file = paste("zupanije", j, sep = ""), zipfile = "pakistan.zip") : requested file not found in the zip file
3: In unzip(file = paste("zupanije", j, sep = ""), zipfile = "pakistan.zip") : requested file not found in the zip file
Please suggest a way to deal with this error.
This is the original code I found, but the zip.file.extract function is no longer part of R:
for(j in list(".shp", ".shx", ".dbf")){
fname <- zip.file.extract(file=paste("zupanije", j, sep=""),
zipname="zupanije.zip")
file.copy(fname, paste("./zupanije", j, sep=""), overwrite=TRUE)
}
unlink("zupanije.zip")
I want to automate downloading the shapefile from a website, unzipping it and placing it in another folder, and then display it using the readShapePoly() function from the maptools library.
Your code works for me for a zip file that contains those files. The error suggests those files are not contained in the zip file. Since you say you are trying to extract a "directory" perhaps they are in a subdirectory in the zipfile? For example, if I put the files in a "temp" directory and then create a zip file of that directory, I must add the directory to the file path, like this:
f <- "test.zip"
for(j in list(".shp", ".shx", ".dbf"))
{
# note "pakistan" directory added to path
# unzip pakistan/zupanije.shp (or .shx or .dbf) out of test.zip
fname <- unzip(file=paste("pakistan/zupanije", j, sep=""), zipfile= f)
#copy extracted file to destination directory
file.copy(fname, paste("./destination", j, sep="/"), overwrite=TRUE)
}
If you are in a Linux-like environment, you could try the following command to inspect the zip file and ensure it contains what you think it contains, at the path you expect:
unzip -vl pakistan.zip
By the way, your code will output the files "./pakistan/.dbf", "./pakistan/.shx" and "./pakistan/.shp". Is that what you want? Or do you perhaps want "pakistan.shx", etc., in which case this change is needed:
-file.copy(fname, paste("./pakistan", j, sep="/"), overwrite=TRUE)
+file.copy(fname, paste("./pakistan", j, sep=""), overwrite=TRUE)
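The same inspection can be done from within R, since `unzip()` with `list = TRUE` returns the archive listing without extracting anything. The sketch below selects members by extension from such a listing; `pick_members` is a hypothetical helper, and the "pakistan/" prefix in the example listing is an assumption about how the archive might be laid out:

```r
# Hypothetical helper: pick archive members by file extension from a listing.
pick_members <- function(member_names, exts = c("shp", "shx", "dbf")) {
  pattern <- paste0("\\.(", paste(exts, collapse = "|"), ")$")
  grep(pattern, member_names, value = TRUE)
}

# Example listing (assumed layout); with a real archive it would come from
#   unzip("pakistan.zip", list = TRUE)$Name
listing <- c("pakistan/Pak_admin0.shp", "pakistan/Pak_admin0.shx",
             "pakistan/Pak_admin0.dbf", "pakistan/Pak_admin0.prj",
             "pakistan/Pak_admin0.qpj")
wanted <- pick_members(listing)
print(wanted)

# Extraction into a destination folder without the internal directory (not run here):
# unzip("pakistan.zip", files = wanted, exdir = "destination", junkpaths = TRUE)
```

Taking the member names from the listing avoids guessing the internal path, which is what produces the "requested file not found in the zip file" warning.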
My current directory is c:/users/akshay/Documents
But all my data is in the directory "specdata", whose path is c:/users/akshay/Documents/specdata
When I type these commands separately in the console, they work:
path <- "C:/Users/akshay/Documents"
directory <- "specdata"
setwd(paste(path, directory, sep="/", collapse=NULL))
But when I use them in a function like this, it won't change my working directory.
pollutantmean <- function(directory){
directory <- character(1)
path <- character(1)
path <- "C:/Users/akshay/Documents"
setwd(paste(path, directory, sep="/", collapse=NULL))
}
When I call
>pollutantmean("specdata")
it won't change my working directory. Why is that?
What is the problem?
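The line directory <- character(1) overwrites the argument with an empty string, so setwd() is called on the path you are already in. You don't need the character() calls, so remove them. To check the path being built, try returning the result of paste():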
pollutantmean <- function(directory){
path <- "C:/Users/akshay/Documents"
return(paste(path, directory, sep="/", collapse=NULL))
}
pollutantmean("specdata")
Output:
> pollutantmean("test")
[1] "C:/Users/akshay/Documents/test"
Change directory:
pollutantmean <- function(directory){
path <- "C:/Users/akshay/Documents"
setwd(paste(path, directory, sep="/", collapse=NULL))
}
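To see why the original function appeared to do nothing, the two versions can be compared side by side. This is a small sketch that uses `tempdir()` as a stand-in for "C:/Users/akshay/Documents"; the function names `broken` and `fixed` and the extra `path` argument are hypothetical:

```r
# Version from the question: the first line overwrites the argument with "".
broken <- function(directory, path) {
  directory <- character(1)          # character(1) is the empty string ""
  paste(path, directory, sep = "/")  # so this is just path followed by "/"
}

# Fixed version: use the argument as passed.
fixed <- function(directory, path) {
  setwd(file.path(path, directory))
  getwd()
}

base <- tempdir()                    # stand-in for the Documents folder
dir.create(file.path(base, "specdata"), showWarnings = FALSE)

broken_path <- broken("specdata", base)
print(broken_path)                   # the base path plus a trailing "/"

old <- getwd()
new_wd <- fixed("specdata", base)
print(basename(new_wd))
setwd(old)                           # restore the original working directory
```

Since `setwd()` changes the working directory for the whole session, restoring the previous one afterwards (as in the last line) keeps the side effect contained.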