Running a same script in sub folders - r

In my main folder i have many sub folders like AA,BB,CC,DD ...etc. and all folders have a common script named run_script.R and i want to run this script in every folder. folder can be any amount.
Its working abut running in first folder only ,but i wanted it to run in every folder.
also when i am using setwd(folder) then showing error
Error in setwd(folder) : cannot change working directory
data_folder <- "C:/Users/mosho/Desktop/New folder (2)/"
allfolders <- data.frame(Folders = list.dirs(path = data_folder, recursive = F, full.names = F))
r_scripts <- "run_script.R"
for (folder in allfolders$Folders) {
#setwd(folder)
message(folder)
source(paste0(data_folder,folder,"/",r_scripts))
}

You are on a right path, I did some minor tweaks to your script which will resolve the issue. The points missing in your scripts are;
the allfolders contains the folder name not the entire explicit path. To set the working directory you need to set give the explicit path, by only calling the folder name will result into error unless you existing working directory is contains that folder. Anyways, its best practice to work with full path names.
also to simplify setting up allfolders as list for iterator will make your life lot easier than a data frame
Below is my work-out;
I created some dummy folders (DIC01, DIC02, DIC03...) under path "C:\Users\XXXXXX\Documents\TEST MAIN", and placed code run_script.R inside each one. This run_script.R contains simple code print("Hello World !!")
Next I set initial working directory where to the path where all the folders present i.e. to path "C:\Users\XXXXXX\Documents\TEST MAIN". Next listed the folders/directories present within this path as a list instead of data frame. Next is for loop which iterate over list of folder names. Inside we reset the working directory by the folder name and source the R code.
data_folder <- "C:\\Users\\XXXXXX\\Documents\\TEST MAIN"
setwd(data_folder)
allfolders <- list.dirs(path = data_folder, recursive = F, full.names = F)
r_scripts <- "run_script.R"
for (folder in allfolders) {
print(folder)
setwd(paste0(data_folder,"\\",folder))
source(paste0(data_folder,"\\",folder,"\\",r_scripts))
}
The result I get after the execution is something like this. First the name of the directory and then execution result.
I hope this resolves you problem. If yes Like/Up vote the answer and let me know.

Related

Filepath error when specifying output location for write_dta

I would like to programmatically specify the location of my exports when using write.dta. I have my working directory set to a parent folder and my script is in a child folder called "Script". I want the export to be in a child folder called "Data".
setwd("~/Dropbox/Files")
file_output <- "survey"
path_out <- "./Data"
write_dta(df, paste0(file_output,".dta"), path = path_out, version = 12)
However, I keep getting an error message when R is trying to write. It says it's trying to write to the "Script" folder (where my script file is located in) rather than the desired "Data" folder.
Error: Failed to open '/Users/VancityPlanner/Dropbox/Files/Scripts' for writing
If I put the full path, I still get the same error message, whether it's a child folder or the parent folder (working directory) itself, so I don't think write permissions are an issue.
If I try not specifying the filepath, I have no error messages but it saves it to my working directory, which is not where I want it.
write_dta(df, paste0(file_output,".dta"), version = 12)
Below I show where my working directory is pointing and then I change the path of where I want to save the document in the path statement. Note it has the file path G:/ and the dataset name appended all together. I have a PC but no reason why this shouldn't work on a mac.
library(haven)
getwd()
#"C:/Users/myname/Documents"
write_dta(data = mtcars, path = "G:/mtcars.dta", version = 12)

Copy files from nested folders to new nested folders

I am trying to copy a large number of files from one folder to another. We need to restructure the folders, so there is a translation from the old folder path to a new one. The old folder structure is also nested.
Currently the code I have is not throwing any errors, but is returning false on executing the file.copy for all files.
ETA: When I copy a single file, it works.
allFilePaths <- list.files('./oldTopLevelFolder', recursive = TRUE)
testIds <- c(1:4)
otherTestIds <- c(5:8)
allNewFolders <- paste('newTopLevelFolder', testIds, 'aFolderName', otherTestIds, sep = '/')
lapply(allNewFolders, dir.create, recursive = TRUE)
file.copy(from=allFilePaths, to=allNewFolders,
copy.mode = TRUE)
file.copy can copy multiple files, but only to a single destination folder by the looks of it.
In order to copy a bunch of files into varying destination folders, the following will do the job, where allOldFilePaths is a column containing the old filepath where each file currently exists, and allNewFilePaths is a column containing the new folder path for each file.
# function to copy a single file
copySingleFile <- function(oldPath, newPath) {
file.copy(from=oldPath, to=newPath,
copy.mode = TRUE)
}
# copy each file to its new folder path
mapply(copySingleFile, allFilePathsWithRoot, allNewFilePaths)

Looping through folder and finding specific file in R

I am trying to loop through many folders in a directory, looking for a particular xml file buried in one of the folders. I would then like to save the location of that file and then run my code against that file (I will not include that code in this). What I am asking here is to loop through all the folders and then open the specific file.
For example:
My main folder would be: C:\Parsing
It has two folders named "folder1" and "folder2".
each folder has an xml file that I am interested in, lets say its called "needed.xml"
I would like to have a scrip that loops through the directory and finds those particular scripts.
Do you know how I could that in R.
Using list.files and greplyou could look recursively through all sub-folders
rootPath="C:\Parsing"
listFiles=list.files(rootPath,recursive=TRUE)
searchFileName="needed.xml"
presentFile=grepl(searchFileName,listFiles)
if(nchar(presentFile)) cat("File",searchFileName,"is present at", presentFile,"\n")
Is this what you're looking for?
require(XML)
fol <- list.files("C:/Parsing")
for (i in fol){
dir <- paste("C:/Parsing" , i, "/needed.xml", sep = "")
if(file.exists(dir) == T){
needed <- xmlToList(dir)
}
}
This will locate your xml file and read it into R as a list. I wasn't clear from your question if you wanted the output to be the data itself or just the directory location of your data which could then be supplied to another function/script. If you just want the location, remove the 'xmlToList' function.
I would do something like this (replace *.xml with your filename.xml if you want):
list.files(path = "C:\Parsing", pattern = "*.xml", recursive = TRUE, full.names = TRUE)
This will recursively look for files with extension .xml in the path C:\Parsing and return the full path of the matched files.

Dir.create function in R erases files in parent folders on the computer

For working purpose, I have made a new function which automatically tests whether a folder (with its name specified by its file path) exists or not:
make.dir <- function(fp) {
# If not existing, create a new one
if(!file.exists(fp)) {
make.dir(dirname(fp))
dir.create(fp, recursive = FALSE, showWarnings = FALSE)
} else {
# If existed, delete and replace with a new one
unlink(fp, recursive = FALSE)
dir.create(fp)
}
}
An example would be:
make.dir("D:/Work/R/Sample")
this is supposed to create a folder named "Sample" in the parental folder "R" if that folder did not exist, and replace the folder with a new one, if existed.
However, I have only been trying this with the latter case, i.e. replacing existing folders. Yesterday, I used this code to try to make a folder that did not pre-exist in "Sample" folder (say "Plots"). Instead of creating a folder named "Plots", it detected that the parent folder - "Sample" existed and started to delete every files contained in that folder. Afterwards, it proceeded to delete all of the files in the upper folder - "R" - as well. I had to cancel the code before it deletes my (D:) drive.
It works fine for replacing existing folders, but not with creating new folders. Anyone has any ideas on how to fix the function, or make a new one?
Thanks
As pointed out by #ytk in the comment above, the call for make.dir in the first if statement makes the function go wild. Removing that solved the problem. Thanks, ytk

How to loop over each file in multiple directories in R

I have an object called wanted.bam with the list of wanted file names for all the .bam (is the extension) files in three of my directories path1,path2,path3. I am looping over all these directories to search for the wanted files. What I am trying to do is look for wanted files by looping over each directory and implement a FUNCTION in each file. This loop works for all the matched file in the first directory, but as it progresses to another directory, it breaks giving an error:
Error in value[[3L]](cond) :
failed to open BamFile: file(s) do not exist:
'sort.bam'
my code:
bam.dir<- c("path1","path2","path3")
for (j in 1:length(bam.dir)){
all.bam.files <- list.files(bam.dir[j])
all.bam.files <- grep(wanted.names, all.bam.files, value=TRUE)
print(paste("The wanted number of bam files in this directory:", (length(all.bam.files))))
if(length(all.bam.files)==0){
next
}else{
setwd(bam.dir[j])
}
print(paste("The working directory number:",j,":",(getwd())))
## ****using another loop here for each file to implement a function*****
all.FAD<- {}
for(i in 1:length(all.bam.files)){
output<- FUNCTION(all.bam.files[i])
}
}
You probably don't want to be changing working directory like this. Instead, use the option in list.files, full.names=TRUE, to return the full path of your files. Then, you can just use read.csv, or whatever, on the full path name without need to change directory. Your code is failing because after you set directory, the relative path to the next directory is changed.
If you want to keep changing directories, just make sure you set the directory back to the base directory at the end of the loop.

Resources