I have a folder, Tmin, which contains 18 folders. Each of the 18 folders contains hundreds of files. I would like to write an R program that adds the name of the containing folder to each file name. I do not want to give each file a completely different name; I only want to add the folder name at the beginning of the file name. I am new to R and to programming, and I have not been able to write a batch function that repeats the operation for each folder. The two attached pictures show what I would like to obtain.
For example, the file called "name_date.tiff" in the folder "MACA_Miroc" will become "MACA_Miroc_name_date.tiff". I would like to repeat the operation automatically for each folder. Thanks in advance for any help!
This ought to work:
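mydir <- getwd()  # remember the starting directory so it can be restored
primary_folder <- "C:/Users/Desktop/Test_Data/"
# Keep only the subfolders whose path contains "MACA"
subfolders <- grep("MACA", list.dirs(primary_folder, full.names = TRUE, recursive = FALSE),
                   value = TRUE)
renameFunc <- function(z){
  setwd(z)  # file.rename() below uses bare file names, so work inside the folder
  fnames <- dir(recursive = FALSE, pattern = "\\.(tiff|csv)$")
  addname <- basename(z)  # folder name, e.g. "MACA_Miroc"
  lapply(fnames, function(current_name){
    # Regex to capture the extension at the end of the file name
    ptrn <- ".*\\.([a-zA-Z]{2,4})$"
    extension <- regmatches(current_name, regexec(ptrn, current_name))[[1]][2]
    no_extension <- sub(paste0("\\.", extension, "$"), "", current_name)
    # Note: this appends the folder name and swaps underscores for spaces;
    # use paste0(addname, "_", current_name) for the prefix form asked for
    new_name <- paste(gsub("_"," ", no_extension), " ", addname, ".", extension, sep = "")
    file.rename(current_name, new_name)
  })
}
lapply(subfolders, renameFunc)
setwd(mydir)  # restore the original working directory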
I think that if you're not in the directory where you want to change file names, you must specify the full path. Changing the working directory was a quick way, but you could use full paths instead (using regular expressions to build the correct from and to values for file.rename()). I got some errors at one point when I was not in the directory where I wanted to change the names.
I feel this allows more control over which folders you want to change the names in since incorrect operation can be very messy. You may also want to skip some file extensions or subfolders etc.
# Your path folder (stored in `folder`)
# Extract files list (stored in `files`)
# Extract folder name (stored in `folder_name`)
# Rename all files
file.rename(from = paste0(folder,files),to = paste0(folder,folder_name,"_",files))
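For reference, the four steps above can be put together roughly as follows. This is only a sketch: the helper name prefix_with_folder, the .tiff pattern, and the temporary demo folder are my assumptions, not part of the original answer. It uses file.path so the folder does not need a trailing slash.

```r
# Hypothetical helper: prefix every matching file in `folder` with the folder's own name.
prefix_with_folder <- function(folder, pattern = "\\.tiff$") {
  files <- list.files(folder, pattern = pattern)   # bare file names
  folder_name <- basename(folder)                  # e.g. "MACA_Miroc"
  # Skip files that already start with the prefix so a second run does not add it twice
  files <- files[!startsWith(files, paste0(folder_name, "_"))]
  file.rename(from = file.path(folder, files),
              to   = file.path(folder, paste0(folder_name, "_", files)))
}

# Demo in a throwaway directory standing in for "Tmin" (assumed layout)
tmin <- tempfile("Tmin_demo_")
dir.create(file.path(tmin, "MACA_Miroc"), recursive = TRUE)
file.create(file.path(tmin, "MACA_Miroc", "name_date.tiff"))

# Repeat the operation for each subfolder
subfolders <- list.dirs(tmin, full.names = TRUE, recursive = FALSE)
invisible(lapply(subfolders, prefix_with_folder))
renamed <- list.files(file.path(tmin, "MACA_Miroc"))
```

Try it on a copy of the data first, since file.rename changes files in place.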
I am trying to read all files in a specific sub-folder of the wd. I have been able to add a for loop successfully, but the loop only looks at files within the wd. I thought the command line:
directory <- 'folder.I.want.to.look.in'
would enable this but the script still only looks in the wd. However, the above command does help create a list of the correct files. I have included the script below that I have written but not sure what I need to modify to aim it at a specific sub-folder.
directory <- 'folder.I.want.to.look.in'
files <- list.files(path = directory)
out_file <- read_excel("file.to.be.used.in.output", col_names = TRUE)
for (filename in files){
  filepath <- paste0(filename)
  ## Import data
  data <- read_excel(filepath, skip = 8, col_names = TRUE)
  data <- data[, -c(6:8)]
  # ... further script ...
}
The further script is irrelevant to this question and works fine. I just can't get the loop to look over each file in files from directory. Many thanks in advance
Set your base directory, and then use it to create a vector of all the files with list.files, e.g.:
base_dir <- 'path/to/my/working/directory'
all_files <- file.path(base_dir, list.files(base_dir, recursive = TRUE))
Then just loop over all_files. By default, list.files has recursive = FALSE, i.e., it only returns the names of the files and directories directly inside the directory you specify, rather than descending into each subfolder. Setting recursive = TRUE returns each file's path relative to your base directory, which is why we join it to base_dir with file.path (paste0 would omit the "/" separator). Alternatively, full.names = TRUE makes list.files prepend the directory for you.
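To see the difference between relative and full paths, here is a small self-contained sketch. The temporary folder and the file names are made up for the demo, and file.exists stands in for the read_excel call from the question.

```r
# Throwaway directory tree standing in for the real base directory (assumption)
base_dir <- tempfile("base_dir_")
dir.create(file.path(base_dir, "sub"), recursive = TRUE)
file.create(file.path(base_dir, c("a.xlsx", "sub/b.xlsx")))

# Paths relative to base_dir: not usable as-is unless base_dir is the working directory
rel_paths <- list.files(base_dir, recursive = TRUE)

# Paths that include base_dir: usable from any working directory
full_paths <- list.files(base_dir, recursive = TRUE, full.names = TRUE)

found <- logical(0)
for (filepath in full_paths) {
  # read_excel(filepath, skip = 8, col_names = TRUE) would go here
  found <- c(found, file.exists(filepath))
}
```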
I have recently learned to code with R and I sort of manage to handle the data within files but I can't get it to manipulate the files themselves. Here is my problem:
I'd like to open successively, in my working directory "Laurent/R", the 3 folders that are within it ("gene_1", "gene_2", "gene_3").
In each folder, I want one specific .csv file (the one containing the specific word "Cq") to be renamed as "gene_x_Cq" (and then to move these 3 renamed files in a new folder (is that necessary?)).
I want then to be able to successively open these 3 .csv files (with read.csv i suppose) to manipulate the data within them.
I've looked at different functions like list.files, unlist, and file.rename, but I'm not sure they are appropriate and I can't figure out how to use them in my case.
Can anyone help ? (I use a Mac)
Here's a potential solution. If you don't understand something, just shout out and ask!
library(stringr)  # str_detect, fixed, str_sub, word
library(purrr)    # map_dfr
setwd("Your own file path/Laurent")
# list all .csv files
csvfiles <- list.files(recursive = TRUE, pattern = "\\.csv$")
# Pick out files that have cq in them, ensuring that you ignore uppercase/lowercase
cq.files <- csvfiles[str_detect(csvfiles, fixed("cq", ignore_case = TRUE))]
# Get the gene number for each file - using "2" here because the gene folder is at the second level in the file path
gene.nb <- str_sub(word(cq.files, 2, 2, sep = "/"), 6, 6)
# create a new folder to place new files into
dir.create("R/genefiles")
# This will copy files, not move them. To move them, use file.rename - but be careful, I'd try file.copy first.
cq.files <- file.copy(cq.files,
                      paste0("R/genefiles/gene_", gene.nb, "_", "Cq", ".csv"))
# Now to work with all files in the new folder
genefiles <- list.files("R/genefiles", full.names = TRUE)
# This will bring in all data into one dataframe. If you want them brought in as separate dataframes,
# use something like gene1 <- read.csv("R/genefiles/gene_1_Cq.csv")
files <- map_dfr(genefiles, read.csv)
I have tried looking at File extension renaming in R and using the script without any luck. My question is very much the same.
I have a bunch of files with a file extension that I want to change. I have used the following code but cannot get the last step to work.
I know similar questions have been asked before but I'm simply stuck and therefore reaching out anyway.
startingDir<-"/Users/anders/Documents/Juni 2019/DATA"
endDir<-"/Users/anders/Documents/Juni 2019/DATA/formatted"
#List over files in startingDir with the extension .zipwblibcurl that I want to replace
old_files<-list.files(startingDir,pattern = "\\.zipwblibcurl")
#Rename the file extension, making a new list in R that changes .zipwblibcurl to .zip
new_files <- gsub(".zipwblibcurl", ".zip", old_files)
#Replacing the old files in the startingDir. Eventually I would like to move them to endDir. For simplicity I have just tried as in the other post without any luck:...
file.rename( old_files, new_files)
After running file.rename I get the output FALSE for every entry.
The full answer, including the comment from @StéphaneLaurent: make sure that you have full.names = TRUE inside list.files(); otherwise only the file name is returned, not the path to the file.
Full working snippet:
old <- list.files(startingDir,
                  pattern = "\\.zipwblibcurl$",
                  full.names = TRUE)  # return full paths, not just file names
# replace the extension in the file names
new <- sub("\\.zipwblibcurl$", ".zip", old)
# Rename old file names to the new file names
file.rename(old, new)
Like @StéphaneLaurent said, it's most likely that R looks in the current working directory for the files and can't find them. You can correct this by passing full paths on both sides (endDir must already exist):
file.rename(file.path(startingDir, old_files), file.path(endDir, new_files))
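The FALSE results are easy to reproduce in a temporary folder. The sketch below uses made-up file names and throwaway directories (starting_dir and end_dir are stand-ins, not the asker's real paths) and assumes the working directory does not happen to contain files with these names.

```r
# Throwaway stand-ins for startingDir and endDir (assumptions for the demo)
starting_dir <- tempfile("DATA_")
end_dir <- file.path(starting_dir, "formatted")
dir.create(end_dir, recursive = TRUE)
file.create(file.path(starting_dir, c("a.zipwblibcurl", "b.zipwblibcurl")))

old_files <- list.files(starting_dir, pattern = "\\.zipwblibcurl$")  # bare names
new_files <- sub("\\.zipwblibcurl$", ".zip", old_files)

# Bare names are resolved against getwd(), so the files are not found
bare_result <- suppressWarnings(file.rename(old_files, new_files))

# Full paths on both sides rename and move in one step
full_result <- file.rename(file.path(starting_dir, old_files),
                           file.path(end_dir, new_files))
```

Here bare_result should be all FALSE and full_result all TRUE, which matches what the question describes.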
I have 105 zipped files in a folder. Each contains one csv file with the same name, i.e. 'EapTransactions_1'.
Currently I am using the following code in R to extract all of them into a new folder :
outDir<-"C:/Users/dhritul.gupta/Migration Files/Trial1/extract"
zipF=list.files(path = "C:/Users/dhritul.gupta/Migration Files/Trial1", pattern = "*.zip", full.names = TRUE)
ldply(.data = zipF, .fun = unzip, exdir = outDir)
The problem with this approach is that since all file names are the same every one of them get overwritten and only the last one is saved.
Is there anyway to save each one of them by renaming them or adding a prefix/suffix to the file names while extraction?
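You may try using file.rename to give each archive a unique number before the call to unzip. Keep in mind that this renames the archives only; the csv stored inside each one keeps its name, so the extracted files still need to be renamed or kept apart as they come out:
zipF <- list.files(path = "C:/Users/dhritul.gupta/Migration Files/Trial1",
                   pattern = "\\.zip$", full.names = TRUE)
# Build the new archive paths in the same folder, keeping the .zip extension
newZipF <- file.path(dirname(zipF), paste0("EapTransactions_", seq_along(zipF), ".zip"))
file.rename(zipF, newZipF)
ldply(.data = newZipF, .fun = unzip, exdir = outDir)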
I tried to build something based on Tim's idea. It worked for me when I stored the files at a temporary location to rename the files. I then moved the renamed files to the final destination and deleted the temporary files.
TempoutDir <- "C:/Users/dhritul.gupta/Migration Files/Trial1/extract/Temp" # Define a temp location
setwd(TempoutDir) # setwd for rename/remove functions to work
for (i in seq_along(zipF)) {
  unzip(zipF[i], exdir = TempoutDir, overwrite = FALSE)
  # Files would be overwritten because they share a name. Give each file a new name with a
  # random number from runif, save it at the final location, then delete it from the temp folder.
  a <- list.files(TempoutDir) # Vector with actual file name
  b <- paste(runif(length(a), min = 0, max = 1000), a) # Vector with a temp number prepended to the file name
  file.rename(a, b) # Rename the file in temp location
  file.copy(list.files(TempoutDir), outDir) # Copy file from temp location to main location
  file.remove(list.files(TempoutDir)) # Delete files in Temp location
}
rm(a, b) # Delete vectors a, b from environment
You should have all the files moved to the desired folder with random numbers prepended to the file names and nothing left in the temp folder.
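If random numbers in the names are not wanted, a variation on the same temp-folder idea is to prefix each extracted file with the name of the archive it came from. This is only a sketch: unzip_with_prefix is a hypothetical helper, and it assumes the archive names themselves are unique.

```r
# Hypothetical helper: extract each archive into its own scratch folder, then
# copy the extracted files to out_dir with the archive's name as a prefix.
unzip_with_prefix <- function(zip_paths, out_dir) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  targets <- lapply(zip_paths, function(zip_path) {
    scratch <- tempfile("unzip_")
    dir.create(scratch)
    on.exit(unlink(scratch, recursive = TRUE), add = TRUE)  # clean up the scratch folder
    extracted <- unzip(zip_path, exdir = scratch)           # paths of the extracted files
    prefix <- tools::file_path_sans_ext(basename(zip_path)) # archive name without ".zip"
    target <- file.path(out_dir, paste0(prefix, "_", basename(extracted)))
    file.copy(extracted, target)
    target
  })
  as.character(unlist(targets))  # character(0) when no archives were given
}
```

With the variables from the question, the call would look like unzip_with_prefix(zipF, outDir).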
Is there any way to automatically delete all files or folders with a few R commands?
I am aware of the unlink() or file.remove() functions, but for those you need to define a character vector with exactly all the names of the files you want to delete. I am looking more for something that lists all the files or folders within a specific path (e.g. 'C:/Temp') and then delete all files with a certain name (regardless of its extension).
Any help is very much appreciated!
Maybe you're just looking for a combination of file.remove and list.files? Maybe something like:
do.call(file.remove, list(list.files("C:/Temp", full.names = TRUE)))
And I guess you can filter the list of files down to those whose names match a certain pattern using grep or grepl, no?
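As a rough illustration of that filtering step, the sketch below removes every file with a given name regardless of its extension. The folder is a temporary stand-in for 'C:/Temp' and the name "report" is just an example I picked.

```r
# Throwaway folder standing in for 'C:/Temp' (assumption for the demo)
temp_dir <- tempfile("Temp_")
dir.create(temp_dir)
file.create(file.path(temp_dir, c("report.csv", "report.txt", "keep.csv")))

all_files <- list.files(temp_dir, full.names = TRUE)

# Compare each file name, with its extension stripped, against the target name
stems <- tools::file_path_sans_ext(basename(all_files))
to_remove <- all_files[stems == "report"]

removed_ok <- file.remove(to_remove)
remaining <- list.files(temp_dir)
```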
For all files in a known path you can:
dir_to_clean <- tempdir() # or wherever
# create some junk to test it with
file.create(file.path(dir_to_clean, paste("test", 1:5, "txt", sep = ".")))
# Now remove them (no need for messing about with do.call)
file.remove(dir(
  dir_to_clean,
  pattern = "^test\\.[0-9]\\.txt$",
  full.names = TRUE
))
You can also use unlink as an alternative to file.remove.
Using a combination of dir and grep this isn't too bad. This could probably be turned into a function that also tells you which files are to be deleted and gives you a chance to abort if it's not what you expected.
# Which directory?
mydir <- "C:/Test"
# What phrase do you want contained in
# the files to be deleted?
deletephrase <- "deleteme"
# Look at directory
dir(mydir)
# Figure out which files should be deleted
id <- grep(deletephrase, dir(mydir))
# Get the full path of the files to be deleted
todelete <- dir(mydir, full.names = TRUE)[id]
# Delete them
unlink(todelete)
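One way the function mentioned above might look, as a sketch only: delete_matching and its dry_run argument are names I made up, and the default is to preview rather than delete so there is a chance to abort.

```r
# Hypothetical helper: list the files whose names contain `deletephrase`,
# and delete them only when dry_run = FALSE.
delete_matching <- function(mydir, deletephrase, dry_run = TRUE) {
  todelete <- dir(mydir, pattern = deletephrase, full.names = TRUE)
  if (dry_run) {
    message("Would delete ", length(todelete), " file(s):")
    message(paste(basename(todelete), collapse = "\n"))
    return(invisible(todelete))
  }
  unlink(todelete)
  invisible(todelete)
}

# Demo in a throwaway folder standing in for "C:/Test"
test_dir <- tempfile("Test_")
dir.create(test_dir)
file.create(file.path(test_dir, c("deleteme_1.txt", "keep.txt")))

preview <- delete_matching(test_dir, "deleteme")       # preview only, nothing removed
after_preview <- list.files(test_dir)
delete_matching(test_dir, "deleteme", dry_run = FALSE) # actually delete
after_delete <- list.files(test_dir)
```

Note that the phrase is treated as a regular expression, as in the grep call above.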
To delete everything inside the folder, but keep the (now empty) folder:
unlink("path/*", recursive = TRUE, force = TRUE)
To delete everything inside the folder, and also delete the folder:
unlink("path", recursive = TRUE, force = TRUE)
Use force = TRUE to override any read-only/hidden/etc. issues.
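Both forms can be tried safely in a temporary folder first. The sketch below uses a made-up directory tree; as far as I know, on Unix-alikes the "*" wildcard typically does not match dot-files, so hidden files may survive the first call.

```r
# Throwaway folder standing in for "path" (assumption for the demo)
path <- tempfile("path_")
dir.create(file.path(path, "nested"), recursive = TRUE)
file.create(file.path(path, c("a.txt", "nested/b.txt")))

# Empty the folder but keep the folder itself
unlink(file.path(path, "*"), recursive = TRUE, force = TRUE)
emptied <- dir.exists(path) && length(list.files(path)) == 0

# Remove the folder as well
unlink(path, recursive = TRUE, force = TRUE)
removed <- !dir.exists(path)
```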
I quite like here::here for finding my way through folders (especially if I'm switching between inline evaluation and knit versions of an Rmarkdown notebook)... yet another solution:
# Batch remove files
# Match files in chosen directory with specified regex
files <- dir(here::here("your_folder"), "your_pattern")
# Remove matched files
# file.path adds the "/" separator that paste0 would omit
unlink(file.path(here::here("your_folder"), files))