Applying script in subfolders - os.walk

I cd into a folder and start python. I want to apply a script to fix filenames in a directory and in sub folders.
import os
for dirname, subdirs, files in os.walk('.'):
    os.rename(file, file.replace('\r', '').replace('\n', '').replace(' ', '_'))
    print 'Processed ' + file.replace('\r', '').replace('\n', '')
I get the error "AttributeError: 'list' object has no attribute 'replace'". Help, please?

os.walk yields a 3-tuple for each directory it visits: the path of that directory, a list of its subdirectories, and a list of its files. You unpacked the 3-tuple in the for loop, and you're calling replace on the list of files rather than on a single file name.
You may want something like this:
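for dirname, subdirs, files in os.walk('.'):
    for name in files:
        # Strip CR/LF and turn spaces into underscores
        new_name = name.replace('\r', '').replace('\n', '').replace(' ', '_')
        # Join with dirname, otherwise files in subfolders are not found
        os.rename(os.path.join(dirname, name), os.path.join(dirname, new_name))
        print('Processed ' + name.replace('\r', '').replace('\n', ''))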
You want to iterate through the list of files and do your replacing on each individual file name.
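To see what the loop is actually working with, here is a minimal sketch, assuming Python 3. The helper name names_and_paths and the throwaway directory tree are made up for illustration. It pairs each bare name from the file list with its full path; the names os.walk hands back carry no directory part, so anything that touches files in subfolders probably wants os.path.join first.

```python
import os
import tempfile


def names_and_paths(top):
    """Pair each bare file name from os.walk with its full path."""
    pairs = []
    for dirpath, _dirnames, filenames in os.walk(top):
        # filenames is a list of str; each entry has no directory part
        for name in filenames:
            pairs.append((name, os.path.join(dirpath, name)))
    return pairs


# Throwaway tree for illustration only: <tmp>/sub/a b.txt
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub"))
    open(os.path.join(tmp, "sub", "a b.txt"), "w").close()
    for name, path in names_and_paths(tmp):
        print(name, "->", path)
```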

Related

How to write a function dir_info(directory) which returns 3 values

(1) Number of files inside the directory (recursively)
(2) Number of directories inside the directory (recursively)
def dir_info(directory):
    nfiles = ...  # how many files inside directory?
    ndirs = ...   # how many subdirectories inside directory?
    return nfiles, ndirs
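One possible way to fill in the skeleton is to let os.walk do the recursion and add up the list lengths. This is only a sketch, assuming Python 3: it returns the two counts that the skeleton names (the question does not say what the third value is), it counts every non-directory entry as a file, and the demo tree is invented for illustration.

```python
import os
import tempfile


def dir_info(directory):
    """Return (number of files, number of subdirectories), counted recursively."""
    nfiles = 0
    ndirs = 0
    for _dirpath, dirnames, filenames in os.walk(directory):
        ndirs += len(dirnames)    # subdirectories directly inside this directory
        nfiles += len(filenames)  # non-directory entries directly inside it
    return nfiles, ndirs


# Invented demo tree: <tmp>/f1, <tmp>/a/f2, <tmp>/a/b/f3
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "a", "b"))
    for parts in (("f1",), ("a", "f2"), ("a", "b", "f3")):
        open(os.path.join(tmp, *parts), "w").close()
    print(dir_info(tmp))
```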

Find a given file recursively inside a directory

Find a given file recursively inside a dir. The code I tried is not showing any output, though I have a file C:\Users\anaveed\test\hoax\a.txt
Below is the code:
import glob
import os
os.chdir(r'C:\Users\anaveed\test')
for f in glob.iglob('a.txt', recursive=True):
    print(f)
No output
It looks like you need the ** wildcard in the pattern; recursive=True only has an effect when the pattern contains **:
import glob
for f in glob.iglob(r'C:\Users\anaveed\test\**\a.txt', recursive=True):
    print(f)
This is another way of doing it:
import os
path = r'C:\Users\anaveed\test'
filename = 'a.txt'
for root, dirs, files in os.walk(path):
    for name in files:
        if name == filename:
            print(os.path.join(root, name))
A couple of comments:
You do not need glob if you are not specifying wildcards; just use os.walk().
You do not need to change into a specific directory to look for files in it; just store the path in a variable.
It would be even better to wrap this in a function (perhaps using a list comprehension).
The glob solution is typically faster.
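The function-plus-list-comprehension idea from the comments could look roughly like this. It is a sketch, assuming Python 3; the name find_file and the demo tree are my own choices, not anything from the question.

```python
import os
import tempfile


def find_file(path, filename):
    """Return the full paths of every file called `filename` under `path`."""
    return [
        os.path.join(root, name)
        for root, _dirs, files in os.walk(path)
        for name in files
        if name == filename
    ]


# Invented demo tree mirroring the question's layout: <tmp>/hoax/a.txt
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "hoax"))
    open(os.path.join(tmp, "hoax", "a.txt"), "w").close()
    print(find_file(tmp, "a.txt"))
```

Returning a list instead of printing keeps the search reusable; the caller decides whether to print the matches, take only the first one, or treat an empty list as "not found".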

Recursively search subdirectories to find the first instance of file

I have a complex dataset that is spread over 80 directories for each city (C). Each of these cities has multiple, non-identical subdirectories of varying depth. For example, city 1 can have 5 subdirectories a-e, each of which can again have multiple subdirectories. I need to find the first instance of a .txt file in each terminal subdirectory and apply a function to it (the function is already written). There are no .txt files in the pre-terminal subdirectories.
lapply(list.dirs, function(x) {
  if (length(list.files(path = x, pattern = ".txt")) == 0) {
    # apply function to .txt file
  } else {
    # lapply list.dirs etc.
  }
})
However, I'm left with a never-ending loop this way. How can this be done efficiently?
You may need something like this:
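Treat_txt <- function(direct) {
  # "\\.txt$" matches only names ending in .txt (a bare ".txt" lets the dot match any character)
  txt_files <- list.files(direct, pattern = "\\.txt$", full.names = TRUE)
  if (length(txt_files) > 0) {
    # do what you need to do with the first text file, txt_files[1]
  } else {
    # no .txt here, so descend one level into each subdirectory
    dirs <- list.dirs(direct, full.names = TRUE, recursive = FALSE)
    sapply(dirs, Treat_txt)
  }
}
And then you can just call the function with the path of the "top" directory.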
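For comparison, the same recursion can be sketched in Python 3. The assumptions are all mine: the function name first_txt_per_leaf is made up, "first" is taken to mean first in alphabetical order, and the function only collects the paths, so whatever needs to be done with each file is left to the caller.

```python
import os
import tempfile


def first_txt_per_leaf(top):
    """Return the first .txt file (alphabetically) of each directory that has one.

    Descent stops at the first level where .txt files appear, matching the
    statement that pre-terminal directories contain no .txt files.
    """
    entries = sorted(os.listdir(top))
    txt = [n for n in entries
           if n.endswith(".txt") and os.path.isfile(os.path.join(top, n))]
    if txt:
        return [os.path.join(top, txt[0])]
    found = []
    for name in entries:
        child = os.path.join(top, name)
        if os.path.isdir(child):
            found.extend(first_txt_per_leaf(child))
    return found


# Invented demo tree: <tmp>/a/x/{1.txt,2.txt} and <tmp>/b/3.txt
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "a", "x"))
    os.mkdir(os.path.join(tmp, "b"))
    for parts in (("a", "x", "2.txt"), ("a", "x", "1.txt"), ("b", "3.txt")):
        open(os.path.join(tmp, *parts), "w").close()
    print(first_txt_per_leaf(tmp))
```

Because each call only recurses into the direct children of the directory it was given, it cannot revisit a directory, which is what avoids the never-ending loop.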

Copy files from few directories

I am trying to write code that copies all folders and files from an existing folder to another (the origin folder contains files and further folders, which in turn contain files and folders...)
My idea was to do something like this:
files <- list.files(Dir.origen)
for (i in files) {
  if (!file.info(paste(Dir.origen, i, sep = "/"))$isdir)
    file.copy(paste(Dir.origen, i, sep = "/"), Dir.dest)
  else dir.create(paste(Dir.dest, i, sep = "/"))
}
and then insert the same for loop in the else branch, with more loops nested inside.
My question is whether there is a way to copy an entire directory.
I am also interested in sourcing this code every time I create a new project in RStudio.
As RStudio creates a new directory for an empty project, my objective is to fill this directory with all the content I need.
I found an answer (for Windows); it is easier than it seems:
Dir.origen2 <- gsub("/", "\\\\", Dir.origen) # Directories must use backslashes
Dir.dest2 <- gsub("/", "\\\\", Dir.dest)
comando <- paste0("xcopy ", Dir.origen2, " ", Dir.dest2, " /e /i /y")
system(comando)
where /e copies all directories and subdirectories (including empty ones), /y suppresses the prompt before overwriting existing files, and /i makes xcopy treat Dir.dest as a directory and create it if it does not exist yet.

How can I read multiple files in R

I have many files (around 600) with names like these:
x2008_1_3.txt
x2008_1_4.txt
x2008_1_5.txt
x2008_1_6.txt
x2008_1_7.txt
x2008_1_8.txt
.
.
.
.
x2009_1_3.txt
x2009_1_4.txt
x2009_1_5.txt
x2009_1_6.txt
x2009_1_7.txt
x2009_1_8.txt
.
.
.
.
I have tried many ways to read all of them together into R as input, but I still cannot load them all. I also want the output names to match the input names. Any suggestions?
You can pass a file pattern to list.files to get a list of the files:
list.files(path,pattern="^x[0-9]{4}_1_[0-9][.]txt",full.names = TRUE)
Set recursive=TRUE if your files are in different directories.
I am not the best at R, but this may help. It is a version of a script I use for something similar with CSVs.
# Set the directory; remember to use double backslashes, e.g. "c:\\Folder1\\Folder2"
directory <- "Location of files you want imported"
# List the files, assuming you want every .txt file in that folder
files <- list.files(path = directory, pattern = "[.]txt$")
# Loop through the files; assign() creates one data frame per file, named after the file.
# Swap read.table for read.csv, or append to a list instead, as needed.
for (file in files) {
  assign(file, read.table(file.path(directory, file), sep = "\t"))
}
I hope this helps a little!
