I have a list of source paths and destination paths in Excel. How can I move these files?
source                     destination
C:/users/desk/1/a.pdf      C:/users/desktop/2
C:/users/desk/1/b.pdf      C:/users/desktop/3
C:/users/desk/1/abb.pdf    C:/users/desktop/56
I need to copy each file from its source to the respective destination.
To copy the files, you can use file.copy. As its destination it accepts either a single existing directory or a vector of file paths (one per source file), and it copies the files to that directory or to those paths.
As your destination column contains only directory paths, you need to specify full paths (including file names) for new files. To do this, you can use file.path and basename to concatenate the original file names (in source) to the new directories (destination).
df = data.frame(
source = c('C:/users/desk/1/a.pdf', 'C:/users/desk/1/b.pdf', 'C:/users/desk/1/abb.pdf'),
destination = c('C:/users/desktop/2', 'C:/users/desktop/3', 'C:/users/desktop/56')
)
file.copy(from = df$source, to = file.path(df$destination, basename(df$source)))
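One thing the call above depends on is that every destination folder already exists; otherwise file.copy returns FALSE for that file. A minimal sketch of creating missing folders first, using placeholder files under a temporary directory (the root path and file names here are stand-ins, not the real Excel data):

```r
# Stand-in paths under a temporary directory (hypothetical; replace with the Excel columns)
root <- file.path(tempdir(), "copy_demo")
unlink(root, recursive = TRUE)
dir.create(file.path(root, "1"), recursive = TRUE)
df <- data.frame(
  source = file.path(root, "1", c("a.pdf", "b.pdf")),
  destination = file.path(root, c("2", "3")),
  stringsAsFactors = FALSE
)
file.create(df$source)  # empty placeholder files

# Create each destination folder that does not exist yet
for (d in unique(df$destination)) {
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
}

# One destination path per source file: <destination>/<original file name>
copied <- file.copy(from = df$source, to = file.path(df$destination, basename(df$source)))
print(copied)
```

The returned logical vector has one element per file, so it can be used to check which copies succeeded.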
To move the files, you can use file.rename.
file.rename(from = df$source, to = file.path(df$destination, basename(df$source)))
Note 1: file.rename may only work when moving files between locations on the same drive. To move files across drives you could use file.copy followed by file.remove to remove the original files after copying. If doing this, you should be careful not to remove files if the copy operation fails, e.g.:
file.move <- function(from, to, ...) {
  # copy files and store vector of success
  cp <- file.copy(from = from, to = to, ...)
  # remove only those source files that were copied successfully
  file.remove(from[cp])
  # warn about unsuccessful files
  if (any(!cp)) {
    warning(
      'The following files could not be moved:\n',
      paste(from[!cp], collapse = '\n')
    )
  }
  # return the success vector without printing it
  invisible(cp)
}
file.move(from = df$source, to = file.path(df$destination, basename(df$source)))
Note 2: This all assumes that you have read in your Excel data using read.csv or data.table::fread (for .csv files) or readxl::read_excel (for .xls or .xlsx files).
I have a task that requires me to use a specific column in a CSV spreadsheet that stores the file names, for example:
File Name
CA-001
WV-001
ma-001
My task is to move some files from folder 'source' to folder 'target'.
I'm using this CSV spreadsheet as a crosswalk to select the files whose names match the entries in the column 'File Name'. The source folder contains these files but also other files that are not in the list (e.g. CO-001, SC-001...). If it's helpful, all of the files are PDFs, so we don't need to worry about file type. I want to copy only the files whose names match what's in the CSV spreadsheet. How can I do this?
I have some sample code below, but it doesn't execute successfully.
source <- "C:/Users/53038/MovePDF/Test_From"
target <- "C:/Users/53038/MovePDF/Test_To"
all.files <- list.files(path = source)
csvfile <- read.csv('C:/Users/53038/MovePDF/Master.csv')
toCopy <- all.files[all.files %in% csvfile$Move]
file.copy(toCopy, target)
Thank you!
With the provided code, the names you want to match are in csvfile$File.Name, not csvfile$Move: read.csv converts the column header 'File Name' to File.Name.
I'm assuming the source directory is potentially very large. Since we know the exact file names, there is no need for slow regular expressions to match substrings or for a complete file listing (which is also slow). Instead, I only check whether the exact wanted files exist before copying them:
source <- "C:/Users/53038/MovePDF/Test_From"
target <- "C:/Users/53038/MovePDF/Test_To"
csvfile <- read.csv('C:/Users/53038/MovePDF/Master.csv')
# add .pdf suffix
toCopy <- paste0(csvfile$File.Name,'.pdf')
# add source directory path
toCopy <- file.path(source, toCopy)
# optional: extract only the existing files from toCopy. You can skip this step if you're sure they exist and/or you don't mind receiving errors
toCopy <- toCopy[file.exists(toCopy)]
# make it so
file.copy(toCopy, target, overwrite = TRUE)
I would prefer to keep the .pdf extension in the file name at all times, including in the source CSV. Appending a fixed '.pdf' causes problems on case-sensitive filesystems (almost all Linux installations, rarely macOS or Windows) if the actual extension is .PDF, .Pdf, etc.
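If the extensions cannot be stored in the CSV, one possible workaround is to match on the name without its extension. This is only a sketch: it does take a full directory listing (the cost the answer above tries to avoid), and the folder and the wanted vector are hypothetical stand-ins for the real source folder and csvfile$File.Name.

```r
# Hypothetical stand-ins for the source folder and the CSV column
src <- file.path(tempdir(), "case_demo")
unlink(src, recursive = TRUE)
dir.create(src)
file.create(file.path(src, c("CA-001.PDF", "WV-001.pdf", "CO-001.pdf")))
wanted <- c("CA-001", "WV-001", "ma-001")

# List PDFs regardless of extension case, then compare names without the extension
pdfs <- list.files(src, pattern = "\\.pdf$", ignore.case = TRUE, full.names = TRUE)
stems <- tools::file_path_sans_ext(basename(pdfs))
toCopy <- pdfs[stems %in% wanted]
print(basename(toCopy))
```

Entries in wanted that have no matching file (here ma-001) are simply skipped.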
I am trying to copy a large number of files from one folder to another. We need to restructure the folders, so there is a translation from the old folder path to a new one. The old folder structure is also nested.
Currently the code I have does not throw any errors, but file.copy returns FALSE for every file.
ETA: When I copy a single file, it works.
allFilePaths <- list.files('./oldTopLevelFolder', recursive = TRUE)
testIds <- c(1:4)
otherTestIds <- c(5:8)
allNewFolders <- paste('newTopLevelFolder', testIds, 'aFolderName', otherTestIds, sep = '/')
lapply(allNewFolders, dir.create, recursive = TRUE)
file.copy(from=allFilePaths, to=allNewFolders,
copy.mode = TRUE)
file.copy can copy multiple files, but it only treats the destination as a folder when a single existing directory is given. When several destinations are passed, each one is taken as a target file path, so passing a vector of folders copies nothing.
To copy a bunch of files into varying destination folders, the following will do the job, where allFilePathsWithRoot is a vector containing the old file path where each file currently exists, and allNewFilePaths is a vector containing the new folder path for each file.
# function to copy a single file
copySingleFile <- function(oldPath, newPath) {
file.copy(from=oldPath, to=newPath,
copy.mode = TRUE)
}
# copy each file to its new folder path
mapply(copySingleFile, allFilePathsWithRoot, allNewFilePaths)
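To see the difference on a small scale, here is a sketch using placeholder files in a temporary directory (all paths and names are made up for the demonstration). It first passes the folders directly, as in the question, and then copies pairwise with mapply:

```r
# Hypothetical demo layout under a temporary directory
root <- file.path(tempdir(), "mapply_demo")
unlink(root, recursive = TRUE)
oldDir <- file.path(root, "oldTopLevelFolder")
dir.create(oldDir, recursive = TRUE)
oldPaths <- file.path(oldDir, c("x.txt", "y.txt"))
file.create(oldPaths)
newDirs <- file.path(root, "newTopLevelFolder", c("1", "2"))
for (d in newDirs) dir.create(d, recursive = TRUE)

# Several folders as 'to': each is treated as a target file that already exists,
# so nothing is copied
failed <- file.copy(from = oldPaths, to = newDirs)

# One call per (file, folder) pair: each call sees a single existing directory
ok <- mapply(file.copy, from = oldPaths, to = newDirs)
print(failed)
print(unname(ok))
```

An alternative that avoids mapply is to build full target paths with file.path(newDirs, basename(oldPaths)) and pass those to a single file.copy call.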
I have 105 zipped files in a folder. Each contains one csv file, and all of these have the same name, i.e. 'EapTransactions_1'.
Currently I am using the following code in R to extract all of them into a new folder :
library(plyr)
outDir<-"C:/Users/dhritul.gupta/Migration Files/Trial1/extract"
zipF=list.files(path = "C:/Users/dhritul.gupta/Migration Files/Trial1", pattern = "*.zip", full.names = TRUE)
ldply(.data = zipF, .fun = unzip, exdir = outDir)
The problem with this approach is that, since all the file names are the same, each file overwrites the previous one and only the last one is saved.
Is there any way to save each one of them by renaming them or adding a prefix/suffix to the file names during extraction?
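You may try using file.rename to add a unique number to each extracted file right after it is unzipped, before the next archive overwrites it:
zipF <- list.files(path = "C:/Users/dhritul.gupta/Migration Files/Trial1",
                   pattern = "\\.zip$", full.names = TRUE)
# unzip() returns the paths of the files it extracted
for (i in seq_along(zipF)) {
  extracted <- unzip(zipF[i], exdir = outDir)
  # prefix the archive's index so each extracted file gets a unique name
  file.rename(extracted, file.path(outDir, paste0(i, "_", basename(extracted))))
}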
I tried to build something based on Tim's idea. It worked for me when I stored the files at a temporary location to rename the files. I then moved the renamed files to the final destination and deleted the temporary files.
TempoutDir <- "C:/Users/dhritul.gupta/Migration Files/Trial1/extract/Temp" # define a temp location
setwd(TempoutDir) # set the working directory so rename/copy/remove can use bare file names
for (i in seq_along(zipF)) {
  unzip(zipF[i], exdir = TempoutDir, overwrite = FALSE)
  # Every archive holds a file with the same name, so give the file a new name
  # with a random number in front (via runif), copy it to the final location,
  # and empty the temp folder
  a <- list.files(TempoutDir) # actual file names
  b <- paste(runif(length(a), min = 0, max = 1000), a) # same names with a random number in front
  file.rename(a, b) # rename the files in the temp location
  file.copy(list.files(TempoutDir), outDir) # copy the files from the temp location to the main location
  file.remove(list.files(TempoutDir)) # delete the files in the temp location
  rm(a, b) # delete vectors a and b from the environment
}
You should then have all the files in the desired folder, with random numbers in front of the file names, and nothing left in the temp folder.
I have a folder Tmin which contains 18 folders. Each of the 18 folders contains hundreds of files. I would like to write an R program that adds the name of the containing folder to the name of each file. I do not want to give each file a completely different name; I only want to add the folder name at the beginning of the file name. I am new to R and to programming, and I was not able to write a batch function that repeats the operation for each folder. The picture below shows what I would like to obtain.
For example, the file called "name_date.tiff" contained in the folder "MACA_Miroc" will become "MACA_Miroc_name_date.tiff". Moreover, I would like to repeat the operation automatically for each folder. Thanks in advance for any help!
[Image: wanted situation and organization of my folders and files]
This ought to work:
mydir <- getwd()
primary_folder <- "C:/Users/Desktop/Test_Data/"
# keep only the subfolders whose paths contain "MACA"
subfolders <- grep("MACA", list.dirs(primary_folder, full.names = TRUE, recursive = FALSE),
                   value = TRUE)
renameFunc <- function(z) {
  setwd(z)
  # only .tiff and .csv files directly inside this folder
  fnames <- dir(recursive = FALSE, pattern = "\\.(tiff|csv)$")
  # the folder's own name, e.g. "MACA_Miroc"
  addname <- basename(z)
  lapply(fnames, function(current_name) {
    # prefix the folder name: "name_date.tiff" becomes "MACA_Miroc_name_date.tiff"
    new_name <- paste0(addname, "_", current_name)
    file.rename(current_name, new_name)
  })
}
lapply(subfolders, renameFunc)
setwd(mydir)
I think that if you're not in the directory where you want to change file names, you must specify the full path. Changing the working directory was a quick way to do it, but you could use full paths instead (building the correct from and to values for file.rename()). I got some errors at one point when I was not in the directory where I wanted to change the names.
I feel this allows more control over which folders you want to change the names in since incorrect operation can be very messy. You may also want to skip some file extensions or subfolders etc.
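As a rough illustration of the full-path variant mentioned above, here is a sketch that never calls setwd. The folder layout is created in a temporary directory purely for the demonstration, and the folder names other than MACA_Miroc are invented:

```r
# Hypothetical demo layout: two subfolders, each holding one file
root <- file.path(tempdir(), "Tmin_demo")
unlink(root, recursive = TRUE)
for (d in c("MACA_Miroc", "MACA_Other")) {
  dir.create(file.path(root, d), recursive = TRUE)
  file.create(file.path(root, d, "name_date.tiff"))
}

subfolders <- list.dirs(root, full.names = TRUE, recursive = FALSE)
for (folder in subfolders) {
  old <- list.files(folder, full.names = TRUE)
  old <- old[!dir.exists(old)]          # skip nested directories
  if (length(old) == 0) next            # nothing to rename in this folder
  # same directory, new name = <folder name>_<old file name>
  new <- file.path(dirname(old), paste0(basename(folder), "_", basename(old)))
  file.rename(old, new)
}
print(list.files(root, recursive = TRUE))
```

Running the loop a second time would add the prefix again, so it is worth testing on a copy of the data first.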
Your folder path:
folder <- "C:/path/example/"
Extract the file list:
files <- list.files(folder)
Extract the folder name:
folder_name <- unlist(strsplit(folder, "/"))[length(unlist(strsplit(folder, "/")))]
Rename all files:
file.rename(from = paste0(folder, files), to = paste0(folder, folder_name, "_", files))
I have a large number of nested directories with .ZIP files containing .CSV files that I want to loop through in R, extract the contents using unzip(), and then read the csv files into R.
However, there are many cases (numbering thousands of files) where there are multiple .zip files in the same directory containing .csv files with identical file names. If I set the overwrite=FALSE argument in unzip(), it ignores all duplicated names after the first. What I want is for it to extract all files but add some suffix to the file name that will allow the duplicated files to be extracted to the same directory, so that I do not have to create even more nested subdirectories to hold the files.
Example:
Directory ~/zippedfiles contains:
archive1.zip (consists of foo.csv, bar.csv), archive2.zip (foo.csv, blah.csv)
Run the following:
unzip('~/zippedfiles/archive1.zip', exdir='~/zippedfiles', overwrite=FALSE)
unzip('~/zippedfiles/archive2.zip', exdir='~/zippedfiles', overwrite=FALSE)
The result is
bar.csv
blah.csv
foo.csv
The desired result is
bar.csv
blah.csv
foo.csv
foo(1).csv
Rather than renaming the duplicate file names, why not keep them unique by assigning a separate folder to each unzip action (just like your OS probably would)? This way you don't have to worry about changing file names, and you end up with a single list referencing all unzipped files:
setwd( '~/zippedfiles' )
# get a list of ".zip" files
ziplist <- list.files( pattern = "\\.zip$" )
# start a fresh vector to fill
unzippedlist <- vector( mode = "character", length = 0L )
# for every ".zip" file we found...
for( zipfile in ziplist ) {
  # decide on a name for an output folder (the archive name without ".zip")
  outfolder <- sub( "\\.zip$", "", zipfile )
  # create the output folder
  dir.create( outfolder )
  # unzip into the new output folder (pass the variable, not the string 'zipfile')
  unzip( zipfile, exdir = outfolder, overwrite = FALSE )
  # get a list of files just unzipped
  newunzipped <- list.files( path = outfolder, full.names = TRUE )
  # add that new list of files to the complete list
  unzippedlist <- c( unzippedlist, newunzipped )
}
The vector unzippedlist should contain all of your unzipped files, with every one being unique, not necessarily by file name, but by a combination of directory and filename. So you can pass it as a vector to capture all of your files.
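If the files really do have to land in a single directory with names like foo(1).csv, one possible approach is to extract each archive into a scratch folder and then copy the files across under a name that is still free. The two functions below, unique_path and unzip_keep_all, are hypothetical helpers written for this sketch, not part of R. It assumes the archives contain flat files, since any folder structure inside an archive is discarded by basename.

```r
# Hypothetical helper: return a path in 'dir' for 'name' that does not exist yet,
# inserting (1), (2), ... before the extension when needed
unique_path <- function(dir, name) {
  candidate <- file.path(dir, name)
  stem <- tools::file_path_sans_ext(name)
  ext <- tools::file_ext(name)
  suffix <- if (nzchar(ext)) paste0(".", ext) else ""
  i <- 0
  while (file.exists(candidate)) {
    i <- i + 1
    candidate <- file.path(dir, paste0(stem, "(", i, ")", suffix))
  }
  candidate
}

# Hypothetical wrapper: extract into a scratch folder, then copy each file
# into 'exdir' under a free name (assumes the archive holds flat files)
unzip_keep_all <- function(zipfile, exdir) {
  scratch <- tempfile("unzip_")
  dir.create(scratch)
  on.exit(unlink(scratch, recursive = TRUE))
  extracted <- unzip(zipfile, exdir = scratch)
  for (f in extracted) {
    file.copy(f, unique_path(exdir, basename(f)))
  }
  invisible(NULL)
}

# Small check of the naming helper with a placeholder file
demo <- file.path(tempdir(), "unique_demo")
unlink(demo, recursive = TRUE)
dir.create(demo)
file.create(file.path(demo, "foo.csv"))
print(basename(unique_path(demo, "foo.csv")))  # "foo(1).csv"
print(basename(unique_path(demo, "bar.csv")))  # "bar.csv"
```

With these in place, calling unzip_keep_all once per archive with exdir = '~/zippedfiles' would be the intended usage.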
Another solution might be to use system()/system2() and one of the countless Unix methods to achieve that.