Copy files from nested folders to new nested folders - r

I am trying to copy a large number of files from one folder to another. We need to restructure the folders, so there is a translation from the old folder path to a new one. The old folder structure is also nested.
Currently the code I have is not throwing any errors, but is returning false on executing the file.copy for all files.
ETA: When I copy a single file, it works.
allFilePaths <- list.files('./oldTopLevelFolder', recursive = TRUE)
testIds <- c(1:4)
otherTestIds <- c(5:8)
allNewFolders <- paste('newTopLevelFolder', testIds, 'aFolderName', otherTestIds, sep = '/')
lapply(allNewFolders, dir.create, recursive = TRUE)
file.copy(from=allFilePaths, to=allNewFolders,
copy.mode = TRUE)

file.copy can copy multiple files, but only to a single destination folder by the looks of it.
To copy a set of files into varying destination folders, the following will do the job, where allFilePathsWithRoot is a vector (or data frame column) containing the full path where each file currently exists, and allNewFilePaths is a vector containing the new folder path for each file.
# function to copy a single file
copySingleFile <- function(oldPath, newPath) {
  file.copy(from = oldPath, to = newPath, copy.mode = TRUE)
}
# copy each file to its new folder path
mapply(copySingleFile, allFilePathsWithRoot, allNewFilePaths)

Related

Moving and copying multiple files

I have a list of source paths and destination paths in Excel. How can I move these files?
source                     destination
C:/users/desk/1/a.pdf      C:/users/desktop/2
C:/users/desk/1/b.pdf      C:/users/desktop/3
C:/users/desk/1/abb.pdf    C:/users/desktop/56
I need to copy each file from its source to its respective destination.
To copy the files, you can use file.copy. As the destination, it accepts either a single existing directory or a vector of file paths the same length as the source vector, and it copies the files to that directory or those paths.
As your destination column contains only directory paths, you need to specify full paths (including file names) for new files. To do this, you can use file.path and basename to concatenate the original file names (in source) to the new directories (destination).
df = data.frame(
  source = c('C:/users/desk/1/a.pdf', 'C:/users/desk/1/b.pdf', 'C:/users/desk/1/abb.pdf'),
  destination = c('C:/users/desktop/2', 'C:/users/desktop/3', 'C:/users/desktop/56')
)
file.copy(from = df$source, to = file.path(df$destination, basename(df$source)))
To move the files, you can use file.rename.
file.rename(from = df$source, to = file.path(df$destination, basename(df$source)))
Note 1: file.rename may only work when moving files between locations on the same drive. To move files across drives you could use file.copy followed by file.remove to remove the original files after copying. If doing this, you should be careful not to remove files if the copy operation fails, e.g.:
file.move <- function(from, to, ...) {
  # copy files and store vector of success
  cp <- file.copy(from = from, to = to, ...)
  # remove those files that were successful
  file.remove(from[cp])
  # warn about unsuccessful files
  if (any(!cp)) {
    warning(
      'The following files could not be moved:\n',
      paste(from[!cp], collapse = '\n')
    )
  }
}
file.move(from = df$source, to = file.path(df$destination, basename(df$source)))
Note 2: This all assumes that you have read in your Excel data using read.csv or data.table::fread (for .csv files) or readxl::read_excel (for .xls or .xlsx files).

Read all files in specific folder in R

I am trying to read all files in a specific sub-folder of the wd. I have been able to add a for loop successfully, but the loop only looks at files within the wd. I thought the command line:
directory <- 'folder.I.want.to.look.in'
would enable this but the script still only looks in the wd. However, the above command does help create a list of the correct files. I have included the script below that I have written but not sure what I need to modify to aim it at a specific sub-folder.
directory <- 'folder.I.want.to.look.in'
files <- list.files(path = directory)
out_file <- read_excel("file.to.be.used.in.output", col_names = TRUE)
for (filename in files) {
  show(filename)
  filepath <- paste0(filename)
  ## Import data
  data <- read_excel(filepath, skip = 8, col_names = TRUE)
  data <- data[, -c(6:8)]
  ## further script
}
The further script is irrelevant to this question and works fine. I just can't get the loop to look over each file in files from directory. Many thanks in advance
Set your base directory, and then use it to create a vector of all the files with list.files, e.g.:
base_dir <- 'path/to/my/working/directory'
all_files <- file.path(base_dir, list.files(base_dir, recursive = TRUE))
Then just loop over all_files. By default, list.files has recursive = FALSE, i.e., it returns only the names of the files and directories directly inside the directory you specify rather than descending into each subfolder. With recursive = TRUE it returns file paths relative to your base directory, which is why we prepend base_dir (with a path separator) to each of them.
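As an alternative to prepending the directory by hand, list.files also accepts full.names = TRUE. The following is a minimal sketch of that variant; the demo folder and file names are hypothetical and are created inside tempdir() only so that the example is self-contained.

```r
# Hypothetical demo folder inside the session's temporary directory
base_dir <- file.path(tempdir(), "demo_base")
dir.create(file.path(base_dir, "sub"), recursive = TRUE, showWarnings = FALSE)
writeLines("a", file.path(base_dir, "top.txt"))
writeLines("b", file.path(base_dir, "sub", "nested.txt"))

# Relative paths: these only resolve if the working directory is base_dir
rel_files <- list.files(base_dir, recursive = TRUE)

# full.names = TRUE prepends base_dir, so the paths resolve from any working directory
all_files <- list.files(base_dir, recursive = TRUE, full.names = TRUE)

# Loop over the full paths, as in the question's for loop
for (f in all_files) {
  print(readLines(f))
}
```

In the question's loop, the same idea would mean building files with full.names = TRUE (or calling file.path(directory, filename) inside the loop) before passing the path to read_excel.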

Copy files from folders and sub-folders to another folder and saving the structure of the folders

I have created a list of files by some conditions and I want to copy only the files from that list to a new folder and subfolders like in the origin folder.
The structure of the folders is year/month/day.
This is the code I tried:
from.dir <- "J:/Radar_data/Beit_Dagan/RAW/2018"
## I want only the files from the night
to.dir <- "J:/Radar_data/Beit_Dagan/night"
files <- list.files(path = from.dir, full.names = TRUE, recursive = TRUE)
## night_files is a vector I created with the files I need - only during the night
for (f in night_files) file.copy(from = f, to = to.dir)
But I get all the files in one folder.
Part of my list looks like this:
[1] "J:/Radar_data/Beit_Dagan/H5/2018/03/10/TLV180310142554.h5"
[2] "J:/Radar_data/Beit_Dagan/H5/2018/03/10/TLV180310142749.h5"
[3] "J:/Radar_data/Beit_Dagan/H5/2018/03/10/TLV180310143054.h5"
Is there a way to keep the structure of the folder and the subfolders when copying?
I want to get the same structure of year/month/day in the new "night" folder
You need to use the flag recursive = T inside the copy call, so you don't really need to loop inside the dir.
from = paste0(getwd(),"/output/","output_1")
to = paste0(getwd(),"/output/","output_1_copy")
file.copy(from, to, recursive = T)
Note that you need to create the /output_1_copy directory before the call. You can do it manually or using dir.create(...).
You just need:
file.copy(from = from.dir, to = to.dir,recursive=T)
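The calls above copy a whole directory. When only a selected subset of files should be copied, as with night_files in the question, one possible approach is to work with paths relative to the source folder, recreate each file's sub-folder under the target, and then copy file to file. This is a sketch under assumptions: the folder names, the .h5 file names and the filter are hypothetical stand-ins built in tempdir(), and the selected files are assumed to be expressed relative to the source folder.

```r
# Hypothetical source tree in the session's temporary directory
from_dir <- file.path(tempdir(), "demo_raw")
to_dir   <- file.path(tempdir(), "demo_night")
dir.create(file.path(from_dir, "2018", "03", "10"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(from_dir, "2018", "03", "11"), recursive = TRUE, showWarnings = FALSE)
writeLines("x", file.path(from_dir, "2018", "03", "10", "a.h5"))
writeLines("y", file.path(from_dir, "2018", "03", "10", "b.h5"))
writeLines("z", file.path(from_dir, "2018", "03", "11", "c.h5"))

# Paths relative to from_dir, e.g. "2018/03/10/a.h5"
rel_files <- list.files(from_dir, recursive = TRUE)

# Stand-in for the real selection (the question's night_files)
keep <- rel_files[basename(rel_files) != "b.h5"]

# Recreate each selected file's sub-folder under to_dir
dest <- file.path(to_dir, keep)
for (d in unique(dirname(dest))) {
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
}

# Copy file to file; from and to have the same length
copied <- file.copy(from = file.path(from_dir, keep), to = dest, overwrite = TRUE)
```

The vector copied holds one TRUE or FALSE per file, so failed copies can be inspected afterwards.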

Unzip Multiple files containing same name using R

I have 105 zipped files in a folder. Each contains one csv file, and all of those csv files have the same name, i.e. 'EapTransactions_1'.
Currently I am using the following code in R to extract all of them into a new folder :
library(plyr)
outDir<-"C:/Users/dhritul.gupta/Migration Files/Trial1/extract"
zipF=list.files(path = "C:/Users/dhritul.gupta/Migration Files/Trial1", pattern = "*.zip", full.names = TRUE)
ldply(.data = zipF, .fun = unzip, exdir = outDir)
The problem with this approach is that since all the file names are the same, each extracted file overwrites the previous one and only the last one is saved.
Is there any way to save each of them by renaming them or adding a prefix/suffix to the file names during extraction?
You may try using file.rename to add a unique number to the end of each file, before you make the call which uses unzip:
zipF <- list.files(path = "C:/Users/dhritul.gupta/Migration Files/Trial1",
pattern = "*.zip", full.names = TRUE)
file.rename(zipF, paste0("EapTransactions_", 1:105))
ldply(.data=zipF, .fun=unzip, exdir=outDir)
I tried to build something based on Tim's idea. It worked for me when I stored the files at a temporary location to rename the files. I then moved the renamed files to the final destination and deleted the temporary files.
TempoutDir <-"C:/Users/dhritul.gupta/Migration Files/Trial1/extract/Temp" # Define a temp location
setwd(TempoutDir) #setwd for rename/remove functions to work
for (i in seq_along(zipF)) {
  unzip(zipF[i], exdir = TempoutDir, overwrite = FALSE)
  # Files would be overwritten because they share a name, so give each file a
  # new name with a random number from runif, save it at the final location,
  # and delete the files in the temp folder
  a <- list.files(TempoutDir) # Vector with actual file name
  # Vector with a random number prepended to the file name
  b <- paste(runif(length(a), min = 0, max = 1000), a)
  file.rename(a, b) # Rename the file in temp location
  file.copy(list.files(TempoutDir), outDir) # Copy file from temp location to main location
  file.remove(list.files(TempoutDir)) # Delete files in Temp location
  rm(a, b) # Delete vectors a, b from environment
}
You should have all the files moved to the desired folder with random numbers appended in front of the file names and nothing left in the temp folder
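A variation on the same temp-folder idea, sketched here as an illustration rather than as the method either answer used: extract each archive into its own temporary folder and prefix the extracted file with the archive's name, which is already unique within one folder and avoids relying on random numbers or setwd. The function name unzip_unique is hypothetical, and the sketch assumes each archive contains only files at its top level.

```r
# Hypothetical helper: extract every archive and prefix each extracted file
# with the name of the archive it came from
unzip_unique <- function(zip_files, out_dir) {
  dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
  extracted <- character(0)
  for (z in zip_files) {
    # Separate temporary folder per archive, so nothing is overwritten
    tmp <- tempfile("unzip_")
    dir.create(tmp)
    # unzip returns the paths of the files it extracted
    inner <- unzip(z, exdir = tmp)
    # Archive name without ".zip" becomes the prefix
    prefix <- tools::file_path_sans_ext(basename(z))
    target <- file.path(out_dir, paste0(prefix, "_", basename(inner)))
    ok <- file.copy(inner, target)
    extracted <- c(extracted, target[ok])
    # Clean up the temporary folder
    unlink(tmp, recursive = TRUE)
  }
  extracted
}
```

With the variables from the question, the call would be unzip_unique(zipF, outDir); the return value lists the files that were written to the output folder.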

Copy multiple files from multiple folders to a single folder using R

I want to ask how to copy multiple files from multiple folders to a single folder using R.
Assuming there are three folders:
desktop/folder_A/task/sub_task/
desktop/folder_B/task/sub_task/
desktop/folder_C/task/sub_task/
In each of the sub_task folder, there are multiple files. I want to copy all the files in the sub_task folders and paste them in a new folder (let's name this new folder as "all_sub_task") on desktop. Can anyone show me how to do it in R using the loop or apply function? Thanks in advance.
Here is an R solution.
# Manually enter the directories for the sub tasks
my_dirs <- c("desktop/folder_A/task/sub_task/",
"desktop/folder_B/task/sub_task/",
"desktop/folder_C/task/sub_task/")
# Alternatively, if you want to programmatically find each of the sub_task dirs
# (full.names = TRUE keeps the "desktop" prefix so the paths stay valid)
my_dirs <- list.files("desktop", pattern = "sub_task", recursive = TRUE, include.dirs = TRUE, full.names = TRUE)
# Grab all files from the directories using list.files, flattened into one vector
files <- unlist(lapply(my_dirs, list.files, full.names = TRUE))
# Your output directory to copy files to
new_dir <- "all_sub_task"
# Make sure the directory exists
dir.create(new_dir, recursive = TRUE)
# Copy the files
for (file in files) {
  # See ?file.copy for more options
  file.copy(file, new_dir)
}
Edited to programmatically list sub_task directories.
This code should work. The function takes one directory, for example desktop/folder_A/task/sub_task/, and copies everything in it to a second one. You can use a loop or an apply function to handle more than one directory at once; since the second argument is fixed, sapply(froms, copyEverything, to) works, where froms is a vector of source directories.
copyEverything <- function(from, to) {
  # We search all the files and directories (paths relative to from)
  files <- list.files(from, recursive = TRUE)
  dirs <- list.dirs(from, recursive = TRUE, full.names = FALSE)
  # We create the required directories; existing ones are left as they are
  dir.create(to, showWarnings = FALSE)
  sapply(paste(to, dirs, sep = '/'), dir.create, showWarnings = FALSE)
  # And then we copy the files
  file.copy(paste(from, files, sep = '/'), paste(to, files, sep = '/'))
}
