How to rename files with a specific pattern in R?

There are some .fcs files in a data.000X format (where X = 1, 2, 3...) in a directory.
I want to rename every n file to the following format: exp.fcs (where exp is a text from a vector) if the file to be renamed is an .fcs file.
In other words, I want to rename the files to exp.fcs, where exp is arbitrary text (e.g. F, cA, K, etc.) rather than consecutive letters.
For example, from:
data.0001, data.0002, data.0003, data.0004, data.0005, data.0006...
to
textF_a.fcs, textF_b.fcs, textF_c.fcs, textVv_a.fcs, textVv_b.fcs, textVv_c.fcs ...
I tried to do it with file.rename(from, to) but failed as the arguments have different lengths (and I don't know what it means):
a <- list.files(path = ".", pattern = "data.*$")
b <- paste("data", 1:1180, ".fcs", sep = "")
file.rename(a, b)

Based on your comments, one issue is that your first file isn't named "data.001" - it's named "data.1". Use this:
b <- sprintf("data%.4d.fcs", seq_along(a))
It prepends up to 3 0s (since it seems you have 1000+ files, this may be better) to indices < 1000, so that all names have the same width. If you really just want to see things like "data.001", then use %.3d in the command.
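To see what the zero-padding does, here is a small sketch. It uses the `%04d` form, which gives the same result as `%.4d` for non-negative indices, and the indices are made up for illustration rather than taken from your files:

```r
# Zero-pad indices to a fixed width of 4 digits (illustrative indices only)
idx <- c(1L, 12L, 1180L)
padded <- sprintf("data%04d.fcs", idx)
print(padded)
# "data0001.fcs" "data0012.fcs" "data1180.fcs"
```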

Your code "works" on my machine ("works" in the sense that, when I created a set of files and followed your procedure, the renaming happened correctly). The error is likely that the number of files that you have (length(a)) is different from the number of new names that you give (length(b)). Post back if it turns out that these objects do have the same length.
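A minimal sketch of the length check, combined with one way to build the textF_a.fcs style names from the question. The helper names `make_fcs_names` and `rename_checked` are hypothetical, as is the assumption of three replicates (a, b, c) per label; the demo runs in a temporary directory so no real files are touched:

```r
# Hypothetical helper: build target names from a vector of labels,
# assuming each label covers the same set of replicates.
make_fcs_names <- function(labels, replicates = c("a", "b", "c"), prefix = "text") {
  paste0(prefix, rep(labels, each = length(replicates)), "_",
         rep(replicates, times = length(labels)), ".fcs")
}

# Hypothetical wrapper: refuse to rename when the two vectors differ in length.
rename_checked <- function(from, to) {
  if (length(from) != length(to)) {
    stop("have ", length(from), " files but ", length(to), " new names")
  }
  file.rename(from, to)
}

# Demo in a temporary directory with six empty files
demo_dir <- tempfile("fcs_demo")
dir.create(demo_dir)
old <- file.path(demo_dir, sprintf("data.%04d", 1:6))
file.create(old)

new <- file.path(demo_dir, make_fcs_names(c("F", "Vv")))
rename_checked(sort(old), new)
list.files(demo_dir)
```

Sorting the old names first matters, because the new names are assigned by position.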

As with the (very similar) question here, this function might be of use to you. I wrote it to allow regex find and replace in R. If you're on a mac it can detect and use the frontmost Finder window as a target. Also supports test runs, over-write control, and filtering large folders.
umxRenameFile <- function(baseFolder = "Finder", findStr = NA, replaceStr = NA, listPattern = NA, test = T, overwrite = F) {
# uppercase = u$1
if(baseFolder == "Finder"){
baseFolder = system(intern = T, "osascript -e 'tell application \"Finder\" to get the POSIX path of (target of front window as alias)'")
message("Using front-most Finder window:", baseFolder)
} else if(baseFolder == "") {
baseFolder = paste(dirname(file.choose(new = FALSE)), "/", sep = "") ## choose a directory
message("Using selected folder:", baseFolder)
}
if(is.na(listPattern)){
listPattern = findStr
}
a = list.files(baseFolder, pattern = listPattern)
message("found ", length(a), " possible files")
changed = 0
for (fn in a) {
findB = grepl(pattern = findStr, fn) # returns 1 if found
if(findB){
fnew = gsub(findStr, replace = replaceStr, fn) # replace all instances
if(test){
message("would change ", fn, " to ", fnew)
} else {
if((!overwrite) & file.exists(paste(baseFolder, fnew, sep = ""))){
message("renaming ", fn, " to ", fnew, " failed as it already exists. To overwrite, set overwrite = T")
} else {
file.rename(paste(baseFolder, fn, sep = ""), paste(baseFolder, fnew, sep = ""))
changed = changed + 1;
}
}
}else{
if(test){
# message(paste("bad file",fn))
}
}
}
message("changed ", changed)
}

Related

Automatic file numbering in ggsave as in png

In png(), the first argument is filename = "Rplot%03d.png", which causes files to be generated with ascending numbers. However, in ggsave this doesn't work: the number always stays at the lowest value ("Rplot001.png") and this file is always overwritten.
Looking at the code of the grDevices functions (e.g. grDevices::png()), it appears that the automatic naming happens in functions which are called by .External().
Is there already an implementation of this file naming functionality in R such that it is accessible outside of the grDevices functions?
Edit:
asked differently, is there a way to continue automatic numbering after shutting off and restarting a device? For example, in this code, the two later files overwrite the former ones:
png(width = 100)
plot(1:10)
plot(1:10)
dev.off()
png(width = 1000)
plot(1:10)
plot(1:10)
dev.off()
You can write a function to do this. For example, how about simply adding a time stamp? Something like:
fname = function(basename = 'myfile', fileext = 'png'){
paste(basename, format(Sys.time(), " %b-%d-%Y %H-%M-%S."), fileext, sep="")
}
ggsave(fname())
Or, if you prefer sequential numbering, then something along the lines of
next_file = function(basename = 'myfile', fileext = 'png', filepath = '.'){
old.fnames = grep(paste0(basename,' \\d+\\.', fileext,'$'),
list.files(filepath), value = T)
lastnum = gsub(paste0(basename,' (\\d+)\\.', fileext,'$'), '\\1', old.fnames)
if (!length(lastnum)) {
lastnum = 1
} else {
lastnum = sort(as.integer(lastnum),T)[1] + 1L
}
return(paste0(basename, ' ', sprintf('%03i', lastnum), '.', fileext))
}
ggsave(next_file())
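Another way to get the same effect, sketched here as an alternative rather than as anything built into grDevices or ggplot2, is to reuse the "Rplot%03d.png" pattern and step forward until an unused name is found. The function name `next_free_name` is hypothetical:

```r
# Hypothetical helper: return the first name matching `pattern`
# (a sprintf template with one integer slot) that does not exist yet in `dir`.
next_free_name <- function(pattern = "Rplot%03d.png", dir = ".") {
  i <- 1L
  while (file.exists(file.path(dir, sprintf(pattern, i)))) {
    i <- i + 1L
  }
  file.path(dir, sprintf(pattern, i))
}

# Demo in a temporary directory that already holds two plots
demo_dir <- tempfile("plots_demo")
dir.create(demo_dir)
file.create(file.path(demo_dir, c("Rplot001.png", "Rplot002.png")))

candidate <- next_free_name(dir = demo_dir)
basename(candidate)
```

The result can be passed as the filename to ggsave() or png(). Unlike taking the highest existing number plus one, this version fills gaps in the numbering.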

dir + file.copy returns "file does not exist" warnings due to non-english characters, how can I fix this?

I have this function in RStudio to synchronize 2 folders on windows.
p1 and p2 are paths;
fsync<-function(p1,p2){
A<-dir(p1,all.files = T,recursive = T,ignore.case = T, include.dirs = F,full.names = T);
B<-dir(p2,all.files = T,recursive = T,ignore.case = T, include.dirs = F,full.names = T);
d1<-setdiff(A,B);
d2<-setdiff(B,A);
if(length(d1)!=0) file.copy(d1,p2,overwrite = F,recursive = T)
if(length(d2)!=0) file.copy(d2,p1,overwrite = F,recursive = T)
}
When I ran it, it worked, but it also showed warnings saying "the file does not exist" or "no such file or directory" (I'm not sure of the exact wording right now). I think this only happens with files containing non-English characters (e.g. á, é, ...). How can I make dir() handle these file names correctly?

R move whole folder to another directory

I would like to move the whole folder from one directory to another, this is my code,
folder_old_path = "C:/Users/abc/Downloads/managerA"
path_new = "C:/User/abc/Desktop/managerA"
current_files = list.files(folder_old_path, full.names = TRUE)
file.copy(from = current_files, to = path_new,
overwrite = recursive, recursive = FALSE, copy.mode = TRUE)
However, I am getting this error msg
Error in file.copy(from = current_files, to = path_new, overwrite = recursive, :
more 'from' files than 'to' files
any idea how to fix this? thank you so much for your help!
library(ff)
from <- "~/Path1/" #Current path of your folder
to <- "~/Path2/" #Path you want to move it.
path1 <- paste0(from,"NameOfMyFolder")
path2 <- paste0(to,"NameOfMyFolder")
file.move(path1,path2)
Try using this little code.
Easiest:
file.rename(folder_old_path, path_new)
If you want to check if path_new already exists you can expand the above to:
if (dir.exists(path_new)) {
print(paste("already exists so recursively deleting path_new", path_new))
unlink(path_new, recursive = TRUE)
}
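Putting the check and the rename together, a minimal sketch might look like the following. The function name `move_dir` and its `overwrite` argument are hypothetical, and the demo uses made-up folders under a temporary directory. Note that file.rename() can fail when source and target are on different file systems, which this sketch does not handle:

```r
# Hypothetical helper: move a whole folder, optionally replacing an existing target.
move_dir <- function(from, to, overwrite = FALSE) {
  if (dir.exists(to)) {
    if (!overwrite) stop("target already exists: ", to)
    unlink(to, recursive = TRUE)
  }
  # make sure the parent folder of the target exists
  dir.create(dirname(to), recursive = TRUE, showWarnings = FALSE)
  file.rename(from, to)
}

# Demo with made-up folders inside a temporary directory
root <- tempfile("move_demo")
old_dir <- file.path(root, "Downloads", "managerA")
new_dir <- file.path(root, "Desktop", "managerA")
dir.create(old_dir, recursive = TRUE)
file.create(file.path(old_dir, "report.txt"))

moved <- move_dir(old_dir, new_dir)
list.files(new_dir)
```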
It appears as though the current_files = list.files(folder_old_path, full.names = TRUE) step is unnecessary. If my understanding of the R file documentation is correct, then you should be able to just use the following:
folder_old_path = "C:/Users/abc/Downloads/managerA"
path_new = "C:/User/abc/Desktop/managerA"
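# recursive = TRUE is needed to copy a directory; the folder is copied *into* the existing target directory
file.copy(from = folder_old_path, to = dirname(path_new),
overwrite = TRUE, recursive = TRUE, copy.mode = TRUE)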
If that doesn't work, then you'll have to create a new list of files (iterate over current_files and replace folder_old_path with path_new for each item in the list) and call file.copy on those:
folder_old_path = "C:/Users/abc/Downloads/managerA"
path_new = "C:/User/abc/Desktop/managerA"
current_files = list.files(folder_old_path, full.names = TRUE)
# swap the old folder for the new one in every path
new_files = file.path(path_new, basename(current_files))
file.copy(from = current_files, to = new_files,
overwrite = TRUE, recursive = FALSE, copy.mode = TRUE)
... this all assumes (of course) that both folder_old_path and path_new exist and you have the correct permissions on them.
The linked page does contain a caveat/note about windows paths:
There is no guarantee that these functions will handle Windows
relative paths of the form d:path: try d:./path instead. In
particular, d: is not recognized as a directory. Nor are \\?\ prefixes
(and similar) supported.
On linux you should be able to simply:
1) make the OTHER_DIR if needed. If it is a subdirectory to OUTPUT_DIR then:
dir.create(file.path(OUTPUT_DIR, OTHER_DIR), showWarnings = FALSE)
setwd(file.path(OUTPUT_DIR, OTHER_DIR))
dir.create() will just print a warning if the directory exists. If you want to see the warning, just remove the showWarnings = FALSE.
If it is just another directory at the same level as OUTPUT_DIR then:
dir.create(OTHER_DIR)
2) Then move the file (e.g. if OTHER_DIR is at the same level as OUTPUT_DIR):
file.rename("C:/OUTPUT_DIR/file.csv", "C:/OTHER_DIR/file.csv")

How to save all images in a separate folder?

So, I am running the following code:
dirtyFolder = "Myfolder/test"
filenames = list.files(dirtyFolder, pattern="*.png")
for (f in filenames)
{
print(f)
imgX = readPNG(file.path(dirtyFolder, f))
x = data.table(img2vec(imgX), kmeansThreshold(imgX))
setnames(x, c("raw", "thresholded"))
yHat = predict(gbm.mod, newdata=x, n.trees = best.iter)
img = matrix(yHat, nrow(imgX), ncol(imgX))
img.dt=data.table(melt(img))
names.dt<-names(img.dt)
setnames(img.dt,names.dt[1],"X1")
setnames(img.dt,names.dt[2],"X2")
Numfile = gsub(".png", "", f, fixed=TRUE)
img.dt[,id:=paste(Numfile,X1,X2,sep="_")]
write.table(img.dt[,c("id","value"),with=FALSE], file = "submission.csv", sep = ",", col.names = (f == filenames[1]),row.names = FALSE,quote = FALSE,append=(f != filenames[1]))
# show a sample
if (f == "4.png")
{
writePNG(imgX, "train_101.png")
writePNG(img, "train_cleaned_101.png")
}
}
What it does is basically, takes as input images which have noise in them and removes noise from them. This is only the later part of the code which applies the algorithm prepared from a training dataset (not shown here).
Now, I am not able to figure out, how can I save the cleaned image for each of the images in the test folder. That is, I wish to save the cleaned image for each of the images in the folder and not just the 4.png image. The output image should have the name as 4_cleaned.png if the input image has the name 4.png and it should be saved in a separate folder in the same directory. That is, if input image has the name x.png, the output image should have the name x_cleaned.png and saved in a separate folder. How can I do it?
Tldr; I just want to save the variable named img for each of the filename as number_cleaned.png where number corresponds to the original file name. These new files should be saved in a separate folder.
Alright, so construct the output filename using file.path and a function such as paste or sprintf:
folder_name = 'test'
output_filename_pattern = file.path(folder_name, '%s_cleaned.png')
# strip the final extension, e.g. "4.png" -> "4" (tools::file_path_sans_ext does the same)
remove_extension = function (filename)
gsub('\\.[^.]*$', '', filename)
for (f in filenames) {
# … your code here …
new_filename = sprintf(output_filename_pattern, remove_extension(f))
# … save file here …
}
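As a sketch of the whole naming step, including creating the separate folder, something along these lines should work. The helper name `cleaned_path`, the output folder, and the two example file names are assumptions made for illustration:

```r
library(tools)  # file_path_sans_ext(), shipped with R

# Hypothetical helper: map "4.png" to "<out_dir>/4_cleaned.png"
cleaned_path <- function(f, out_dir) {
  file.path(out_dir, paste0(file_path_sans_ext(basename(f)), "_cleaned.png"))
}

# Hypothetical output folder; created once before the loop
out_dir <- file.path(tempdir(), "cleaned")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

filenames <- c("4.png", "17.png")  # stand-in for list.files(dirtyFolder, ...)
out_files <- vapply(filenames, cleaned_path, character(1),
                    out_dir = out_dir, USE.NAMES = FALSE)
basename(out_files)
```

Inside the original loop, the save step would then be a call such as writePNG(img, cleaned_path(f, out_dir)).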

How do I rename files using R?

I have over 700 files in one folder named as:
Files number 1 to 9 of the first month are named:
water_200101_01.img
water_200101_09.img
Files number 10 to 30 are named:
water_200101_10.img
water_200101_30.img
And so on for the second month.
Files number 1 to 9 are named:
water_200102_01.img
water_200102_09.img
Files number 10 to 30 are named:
water_200102_10.img
water_200102_30.img
How can I rename them without making any changes to the file contents, changing only the names? For example:
water_1
water_2
...till...
water_700
file.rename will rename files, and it can take a vector of both from and to names.
So something like:
old <- list.files(pattern = "^water_.*\\.img$") # pattern is a regular expression, not a glob
file.rename(old, paste0("water_", seq_along(old)))
might work.
If you care about the order specifically, you could either sort the list of files that currently exist or, if they follow a particular pattern, just create the vector of filenames directly (although I note that 700 is not a multiple of 30).
I will set aside the question, "why would you want to?" since you seem to be throwing away information in the filename, but presumably that information is contained elsewhere as well.
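A small sketch of the sorted, previewed version of this idea, run on three made-up files in a temporary directory. Two things are assumptions here: the .img extension is kept on the new names (the question drops it), and glob2rx() from utils is used to turn the shell-style wildcard "water_*.img" into a regular expression, since list.files() expects a regex:

```r
# Demo files in a temporary directory (made-up names following the question's scheme)
demo_dir <- tempfile("water_demo")
dir.create(demo_dir)
file.create(file.path(demo_dir, c("water_200102_01.img",
                                  "water_200101_02.img",
                                  "water_200101_01.img")))

# Convert the glob to a regex, then list and sort the matching files
rx <- glob2rx("water_*.img")
old <- sort(list.files(demo_dir, pattern = rx))
new <- paste0("water_", seq_along(old), ".img")

# Preview the mapping before touching anything
plan <- data.frame(from = old, to = new)
print(plan)

# Apply it; file.rename returns one TRUE/FALSE per file
ok <- file.rename(file.path(demo_dir, plan$from), file.path(demo_dir, plan$to))
```

Because the date is written as year, month, day with leading zeros, alphabetical sorting matches chronological order for these names.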
I wrote this for myself. It is fast, allows regex in find and replace, can ignore the file suffix, and can show what would happen in a "trial run" as well as protect against over-writing existing files.
If you are on a Mac, it can use AppleScript to pick out the current folder in the Finder as a target folder.
umx_rename_file <- function(findStr = "Finder", replaceStr = NA, baseFolder = "Finder", test = TRUE, ignoreSuffix = TRUE, listPattern = NULL, overwrite = FALSE) {
umx_check(!is.na(replaceStr), "stop", "Please set a replaceStr to the replacement string you desire.")
# ==============================
# = 1. Set folder to search in =
# ==============================
if(baseFolder == "Finder"){
baseFolder = system(intern = TRUE, "osascript -e 'tell application \"Finder\" to get the POSIX path of (target of front window as alias)'")
message("Using front-most Finder window:", baseFolder)
} else if(baseFolder == "") {
baseFolder = paste(dirname(file.choose(new = FALSE)), "/", sep = "") ## choose a directory
message("Using selected folder:", baseFolder)
}
# =================================================
# = 2. Find files matching listPattern or findStr =
# =================================================
a = list.files(baseFolder, pattern = listPattern)
message("found ", length(a), " possible files")
changed = 0
for (fn in a) {
if(grepl(pattern = findStr, fn, perl= TRUE)){
if(ignoreSuffix){
# pull suffix and baseName (without suffix)
baseName = sub(pattern = "(.*)(\\..*)$", x = fn, replacement = "\\1")
suffix = sub(pattern = "(.*)(\\..*)$", x = fn, replacement = "\\2")
fnew = gsub(findStr, replacement = replaceStr, x = baseName, perl= TRUE) # replace all instances
fnew = paste0(fnew, suffix)
} else {
fnew = gsub(findStr, replacement = replaceStr, x = fn, perl= TRUE) # replace all instances
}
if(test){
message(fn, " would be changed to: ", omxQuotes(fnew))
} else {
if((!overwrite) & file.exists(paste(baseFolder, fnew, sep = ""))){
message("renaming ", fn, " to ", fnew, " failed as it already exists. To overwrite, set overwrite = TRUE")
} else {
file.rename(paste0(baseFolder, fn), paste0(baseFolder, fnew))
changed = changed + 1;
}
}
}else{
if(test){
# message(paste("bad file",fn))
}
}
}
if(test & changed==0){
message("set test = FALSE to actually change files.")
} else {
umx_msg(changed)
}
}
If you want to replace a section of the file name that matches a given pattern with other text, the following is useful for renaming several files at once. For example, this code takes all of your files containing foo and replaces foo with bob in the file names.
library(stringr) # provides str_replace()
file.rename(list.files(pattern = "foo"), str_replace(list.files(pattern = "foo"), pattern = "foo", replacement = "bob"))
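The same replacement can be sketched in base R with sub(), which avoids the extra package. The file names below are invented for the demo, and everything happens in a temporary directory:

```r
# Demo files in a temporary directory (invented names)
demo_dir <- tempfile("foo_demo")
dir.create(demo_dir)
file.create(file.path(demo_dir, c("foo_1.txt", "my_foo.csv", "other.txt")))

# List once, then derive the new names from that same vector
old <- list.files(demo_dir, pattern = "foo")
new <- sub("foo", "bob", old, fixed = TRUE)  # replaces the first "foo" in each name

renamed <- file.rename(file.path(demo_dir, old), file.path(demo_dir, new))
sort(list.files(demo_dir))
```

Listing the files only once guarantees that the from and to vectors have the same length and order.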
The following was my workaround for matching in sequence and changing all the filenames in a specified directory using simple base code.
old_files <- list.files(path = ".", pattern = "^water_.*\\.img$")
# Build the new names in the same order as the old ones
new_files <- paste0("water_", seq_along(old_files), ".img")
# Rename the old files to the new names
file.rename(from = old_files, to = new_files)

Resources