I have a directory "space" containing 300 CSV files, and its path is "C://rstuff//space".
And I have a function:
myfunction <- function(my_dir, x, y){
}
I want to open some of these CSV files, so I need to get their locations; I use the argument 'my_dir' to indicate the location of the CSV files.
I want to use setwd(paste0("C://rstuff//", my_dir)) (thanks for Batanichek's comment), but I don't think this is a good way to set the path. If I don't know the path exactly, what should I do? Are there any good methods?
You can use list.files
setwd("C://rstuff//space")
my_files <- list.files(pattern = "\\.csv$",
                       full.names = TRUE, recursive = TRUE, ignore.case = TRUE)
This finds all CSV files in your working directory (and its subdirectories) and gives you their paths relative to the working directory.
[1] "./csvs2/data_1-10.csv"
[2] "./csvs2/old/data_1001-1010.csv"
[3] "./overview/results.csv"
Then you can pick out the ones you want to use. For example, I give the important CSV files a number after an underscore, e.g. "data_23", so all non-important files can be excluded with:
my_files <- my_files[grepl("_", my_files)]
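To connect this back to the question: rather than calling setwd() inside myfunction, you can build the path with file.path() and read the files from there directly. A minimal sketch, assuming the function should simply read every matching CSV under my_dir (the read step and the unused x and y arguments are placeholders from the question):
myfunction <- function(my_dir, x, y){
  # Build the directory path without changing the working directory
  csv_dir <- file.path("C:/rstuff", my_dir)
  # Full paths of every CSV below that directory
  my_files <- list.files(csv_dir, pattern = "\\.csv$",
                         full.names = TRUE, recursive = TRUE,
                         ignore.case = TRUE)
  # Read them all into a list of data frames
  lapply(my_files, read.csv)
}
# e.g. myfunction("space", x, y)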
I need to mass-import some data for my R project. Following a guide, I wrote a simple for loop that goes like this:
for (for_variable in list.files(path = "./data", pattern = ".csv$")) {
  temp <- read_csv(for_variable)
  # Some data wrangling
  database <- rbind(database, temp)
  rm(temp)
}
My data is in the data folder inside my working directory, as I've specified with list.files(path = "./data"). The problem is that I can't use read_csv(for_variable) because I get an error:
'file_name.csv' does not exist in current working directory
And if I try to specify the path inside read_csv, it doesn't understand what 'for_variable' is; it tries to find a literal file called 'for_variable' in the data folder. So how can I combine the path and the variable name in read_csv? Or is there another way of solving the problem?
I would recommend reading this post as it is helpful for importing multiple csv files.
But to help with your specific question: your error is likely caused because you need to pass the full path of the files you want to import, which you can get by using the full.names = TRUE argument in list.files(). Passing just the bare file name contained in for_variable to read_csv won't work.
list.files(path = "./data", full.names = TRUE, pattern = ".csv$")
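Putting that together, a minimal sketch of the corrected loop (assuming readr is loaded for read_csv and that database is meant to simply accumulate the rows):
library(readr)
files <- list.files(path = "./data", full.names = TRUE, pattern = ".csv$")
database <- NULL
for (for_variable in files) {
  temp <- read_csv(for_variable)
  # Some data wrangling
  database <- rbind(database, temp)
  rm(temp)
}
An alternative is to read everything with lapply() and bind once at the end, e.g. do.call(rbind, lapply(files, read_csv)), which avoids growing database inside the loop.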
The base R function list.files lists all the files in a given path.
The default call
aa <- list.files(path = ".")
returns a vector of the names of everything in ., e.g.:
[1] "dir1" "file1.R"
I want it just to return "file1.R".
A clunky solution is to instead call
bb <- list.files(path = ".", include.dirs = FALSE, recursive = TRUE)
I get
[1] "dir1/file2.R" "file1.R"
So I can get what I want by calling
intersect(aa, bb)
[1] "file1.R"
But it seems silly to create two objects and intersect them when I feel like list.files can probably give me this directly; I just can't figure out how.
Do you know?
Using list.files(path = ".") lists the files and folders in the directory where you are currently working.
You are correct. The list.files() function can filter what it returns, but getting that extra control requires knowing something about regular expressions.
Using list.files() with the metacharacter $ (end of string) will return all of the .r files in this folder. Note that if there are other files whose names merely end with the letter r, those will also be returned.
The following will return the .r files you want, and this should answer your question.
list.files(pattern = "r$", ignore.case = TRUE)
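If you want files of any extension (not only those ending in r) while still excluding directories, another option is to check file.info() on the results; a minimal sketch:
all_entries <- list.files(path = ".", full.names = TRUE)
# Keep only the entries that are not directories
only_files <- all_entries[!file.info(all_entries)$isdir]
basename(only_files)
# [1] "file1.R"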
Is there any way to automatically delete all files or folders with a few R command lines?
I am aware of the unlink() and file.remove() functions, but for those you need to define a character vector containing exactly the names of the files you want to delete. I am looking more for something that lists all the files or folders within a specific path (e.g. 'C:/Temp') and then deletes all files with a certain name (regardless of their extension).
Any help is very much appreciated!
Maybe you're just looking for a combination of file.remove and list.files? Something like:
do.call(file.remove, list(list.files("C:/Temp", full.names = TRUE)))
And I guess you can filter the list of files down to those whose names match a certain pattern using grep or grepl, no?
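For example, a minimal sketch that only deletes files whose names contain a given pattern (the pattern "old_" is just an illustration); note that file.remove() accepts a character vector directly, so the do.call wrapper is optional:
all_files <- list.files("C:/Temp", full.names = TRUE)
# Keep only the files whose names match the pattern
to_delete <- all_files[grepl("old_", basename(all_files))]
file.remove(to_delete)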
For all files in a known path you can:
unlink("path/*")
dir_to_clean <- tempdir() #or wherever

#create some junk to test it with
file.create(file.path(
  dir_to_clean,
  paste("test", 1:5, "txt", sep = ".")
))

#Now remove them (no need for messing about with do.call)
file.remove(dir(
  dir_to_clean,
  pattern = "^test\\.[0-9]\\.txt$",
  full.names = TRUE
))
You can also use unlink as an alternative to file.remove.
Using a combination of dir and grep this isn't too bad. This could probably be turned into a function that also tells you which files are to be deleted and gives you a chance to abort if it's not what you expected.
# Which directory?
mydir <- "C:/Test"
# What phrase do you want contained in
# the files to be deleted?
deletephrase <- "deleteme"
# Look at directory
dir(mydir)
# Figure out which files should be deleted
id <- grep(deletephrase, dir(mydir))
# Get the full path of the files to be deleted
todelete <- dir(mydir, full.names = TRUE)[id]
# BALEETED
unlink(todelete)
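Following up on that idea, here is a rough sketch of how it could be wrapped into a function with a confirmation prompt (delete_by_phrase is a made-up name, and readline() only prompts in interactive sessions):
delete_by_phrase <- function(mydir, deletephrase) {
  # Figure out which files should be deleted
  todelete <- dir(mydir, pattern = deletephrase, full.names = TRUE)
  if (length(todelete) == 0) {
    message("Nothing to delete.")
    return(invisible(NULL))
  }
  # Show the candidates and give a chance to abort
  cat("About to delete:\n", paste(todelete, collapse = "\n"), "\n")
  answer <- readline("Type 'y' to confirm: ")
  if (tolower(answer) == "y") unlink(todelete) else message("Aborted.")
}
# delete_by_phrase("C:/Test", "deleteme")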
To delete everything inside the folder, but keep the folder empty
unlink("path/*", recursive = T, force = T)
To delete everything inside the folder, and also delete the folder
unlink("path", recursive = T, force = T)
Use force = T to override any read-only/hidden/etc. issues.
I quite like here::here for finding my way through folders (especially if I'm switching between inline evaluation and knit versions of an Rmarkdown notebook)... yet another solution:
# Batch remove files
# Match files in chosen directory with specified regex
files <- dir(here::here("your_folder"), "your_pattern")
# Remove matched files
unlink(here::here("your_folder", files))
I have a folder Tmin which contains 18 folders, and each of those folders contains hundreds of files. I would like to write an R program that adds the containing folder's name to the name of every file inside it. I do not want to rename each file with a completely different name; I only want to add the folder name at the beginning of the existing file name. I am new to R and to programming, and I was not able to write a batch function that repeats the operation for each folder. The two pictures attached show what I would like to obtain.
For example, the file called "name_date.tiff" contained in the folder "MACA_Miroc" would become "MACA_Miroc_name_date.tiff". Moreover, I would like to repeat the operation automatically for each folder. Thanks in advance for any help!
[Image: wanted situation and organization of my folders and files]
This ought to work:
mydir <- getwd()
primary_folder <- "C:/Users/Desktop/Test_Data/"
subfolders <- grep("MACA", list.dirs(primary_folder, full.names = TRUE, recursive = FALSE),
                   value = TRUE)
renameFunc <- function(z){
  setwd(z)
  fnames <- dir(recursive = FALSE, pattern = ".tiff|.csv")
  addname <- basename(z)  # the subfolder's own name
  lapply(fnames, function(current_name){
    # Regex to get the extension; may need to add a $ sign to mark the end of the file name
    ptrn <- ".*\\.([a-zA-Z]{2,4})"
    extension <- regmatches(current_name, regexec(ptrn, current_name))[[1]][2]
    no_extension <- gsub(paste(".", extension, sep = ""), "", current_name)
    new_name <- paste(gsub("_", " ", no_extension), " ", addname, ".", extension, sep = "")
    file.rename(current_name, new_name)
  })
}
lapply(subfolders, renameFunc)
setwd(mydir)
I think that if you're not in the directory where you want to change file names, you must specify the full path. Changing the working directory was a quick way, but you could use full names instead (using regular expressions to get the correct from and to values for file.rename()). I got some errors at one point when I was not in the directory whose files I wanted to rename.
I feel this allows more control over which folders you rename files in, since an incorrect operation can be very messy. You may also want to skip some file extensions or subfolders, etc.
Your folder path:
folder <- "C:/path/example/"
Extract the list of files:
files <- list.files(folder)
Extract the folder name:
folder_name <- unlist(strsplit(folder, "/"))[length(unlist(strsplit(folder, "/")))]
Rename all the files:
file.rename(from = paste0(folder, files), to = paste0(folder, folder_name, "_", files))
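To repeat this automatically for every one of the 18 subfolders, you could wrap the same steps in a loop over list.dirs(); a minimal sketch (the path to Tmin is a placeholder):
parent <- "C:/path/Tmin"
# Immediate subfolders only
subfolders <- list.dirs(parent, full.names = TRUE, recursive = FALSE)
for (folder in subfolders) {
  files <- list.files(folder)
  folder_name <- basename(folder)
  file.rename(from = file.path(folder, files),
              to   = file.path(folder, paste0(folder_name, "_", files)))
}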
I am trying to loop through many folders in a directory, looking for a particular XML file buried in one of the folders. I would then like to save the location of that file and run my code against it (I will not include that code here). What I am asking is how to loop through all the folders and then open the specific file.
For example:
My main folder would be: C:\Parsing
It has two folders named "folder1" and "folder2".
Each folder has an XML file that I am interested in; let's say it's called "needed.xml".
I would like a script that loops through the directory and finds those particular files.
Do you know how I could do that in R?
Using list.files and grepl you could look recursively through all sub-folders:
rootPath <- "C:/Parsing"
listFiles <- list.files(rootPath, recursive = TRUE)
searchFileName <- "needed.xml"
presentFile <- listFiles[grepl(searchFileName, listFiles)]
if (length(presentFile)) cat("File", searchFileName, "is present at", presentFile, "\n")
Is this what you're looking for?
require(XML)
fol <- list.files("C:/Parsing")
for (i in fol){
  xml_path <- file.path("C:/Parsing", i, "needed.xml")
  if (file.exists(xml_path)) {
    needed <- xmlToList(xml_path)
  }
}
This will locate your XML file and read it into R as a list. It wasn't clear from your question whether you wanted the output to be the data itself or just the location of the file, which could then be supplied to another function/script. If you just want the location, remove the xmlToList() call.
I would do something like this (narrow the pattern to your specific file name, e.g. "needed\\.xml$", if you want):
list.files(path = "C:/Parsing", pattern = "\\.xml$", recursive = TRUE, full.names = TRUE)
This will recursively look for files with the .xml extension under C:/Parsing and return the full paths of the matched files.
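As a follow-up, you could store the result and then feed each match to whatever processing you need, for example reading each file with xmlToList() from the XML package, as in the answer above:
library(XML)
found <- list.files(path = "C:/Parsing", pattern = "needed\\.xml$",
                    recursive = TRUE, full.names = TRUE)
# Read every matched file into a list
parsed <- lapply(found, xmlToList)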