use R to loop through subdirectories and copy files

I am trying to create a batch script in R to pre-process some data and one of the first steps I have to do is check to see if a file exists in a sub-directory and then (if it does) create a copy of it with a new name. I'm having trouble with the syntax.
This is my code:
##Define the subject directory path
sDIR = "/home/bsussman/Desktop/WORKSPACE"
#create data frame to loop through
##list of subject directories
subjects <-list.dirs(path = sDIR, full.names = TRUE, recursive = FALSE)
for (subj in 1:length(subjects)){
oldT1[[subj]] <- dir(subjects[subj], pattern=glob2rx("s*.nii"), full.names=TRUE)
T1[[subj]] <- paste(subjects[subj], pattern="/T1.nii",sep="")
if (file.exists(paste(subjects[subj], pattern="/T1.nii",sep=""))=FALSE{
file.copy(oldT1, T1)
}
}
It renames files in one subdirectory, but it will not loop through the rest and gives me these errors:
Error: unexpected '=' in:
"
if (file.exists(paste(subjects[subj], pattern="/T1.nii",sep=""))="
> file.copy(oldT1, T1)
[1] FALSE
> }
Error: unexpected '}' in " }"
> }
Error: unexpected '}' in "}"
I am not so worried about the [1] FALSE message, but any ideas about the rest?
Thanks!!

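It's just a problem with the syntax in the if statement: = is an assignment, not a comparison, and the closing parenthesis of the condition is missing before the {. Try replacing this: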
if (file.exists(paste(subjects[subj], pattern="/T1.nii",sep=""))=FALSE{
file.copy(oldT1, T1)
}
with this:
if (!file.exists(paste(subjects[subj], pattern="/T1.nii",sep=""))){
file.copy(oldT1, T1)
}
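Beyond the if syntax, the loop in the question still passes the whole oldT1 and T1 objects to file.copy and never initialises them. A minimal sketch of the complete loop, assuming each subject folder holds at most one relevant s*.nii file (the function name copy_t1 is hypothetical, not from the question):

```r
# Hypothetical helper: for every subject folder under sDIR, copy the first
# s*.nii file to T1.nii unless T1.nii already exists.
copy_t1 <- function(sDIR) {
  subjects <- list.dirs(path = sDIR, full.names = TRUE, recursive = FALSE)
  copied <- logical(length(subjects))
  for (i in seq_along(subjects)) {
    oldT1 <- dir(subjects[i], pattern = glob2rx("s*.nii"), full.names = TRUE)
    T1 <- file.path(subjects[i], "T1.nii")
    # Copy only when a source file was found and the target is not there yet
    if (length(oldT1) > 0 && !file.exists(T1)) {
      copied[i] <- file.copy(oldT1[1], T1)
    }
  }
  names(copied) <- basename(subjects)
  copied  # TRUE for each folder where a copy was made
}
```

Calling copy_t1("/home/bsussman/Desktop/WORKSPACE") would then return one logical per subject folder, which makes it easy to see where nothing was copied.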

Related

readLines function not recognizing separating character "\t"

My input file contains many lines of tab-delimited information in a text file. Below would be a line from the text file:
100026 TGACTGCATGACGTACAC NM_006342.1 TACC3
My code is as follows:
constant_source <- 'constants.R'
source(constant_source)
source(classes_file)
processFile = function(filepath) {
con = file(filepath, "r")
while ( TRUE ) {
line = readLines(con, sep="\t")
print(line)
if (length(line) == 0 ) {
break
}
}
close(con)
}
The output, however, is as follows:
100026\tTGACTGCATGACGTACAC\tNM_006342.1\tTACC3
Why is the readLines function not respecting the separation parameter? I have been toying with this for a while and am stuck. Sorry about this; I just started learning R today. If it makes a difference, I am using RStudio.
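A possible explanation, offered as a sketch rather than a confirmed answer: base R's readLines() has no sep argument. It returns one string per line, and print() shows the embedded tab as the escape \t. To get the individual fields, the line can be split on the tab afterwards. The helper name read_tab_fields and the temporary file below are assumptions for illustration only:

```r
# Hypothetical helper: read a file and split every line on tab characters
read_tab_fields <- function(filepath) {
  lines <- readLines(filepath)          # one string per line, tabs still inside
  strsplit(lines, "\t", fixed = TRUE)   # list with one character vector per line
}

# Demo on a temporary file holding the sample line from the question
tmp <- tempfile(fileext = ".txt")
writeLines("100026\tTGACTGCATGACGTACAC\tNM_006342.1\tTACC3", tmp)
fields <- read_tab_fields(tmp)
print(fields[[1]])  # four separate fields instead of one string
```

If the goal is a table rather than a list of vectors, read.delim(filepath, header = FALSE) reads tab-separated files directly into a data frame.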

Julia: Extract Zip files within a Zip file

I'm using Julia's ZipFile package to extract and process csv files. No problem, but when I encounter a zip file within the zip file, I'd like to process that as well, but am encountering an error.
Julia ZipFile docs are here: https://zipfilejl.readthedocs.io/en/latest/
Here's the code:
using ZipFile
using DataFrames
function process_zip(zip::ZipFile.ReadableFile)
if split(zip.name,".")[end] == "zip"
r = ZipFile.Reader(zip) #error: MethodError: no method matching seekend(::ZipFile.ReadableFile)
for f in r.files
process_zip(f)
end
end
if split(zip.name,".")[end] == "csv"
df = readtable(zip) #for now just read it into a dataframe
end
end
r = ZipFile.Reader("yourzipfilepathhere");
for f in r.files
process_zip(f)
end
close(r)
The call to ZipFile.Reader gives the error:
MethodError: no method matching seekend(::ZipFile.ReadableFile)
Closest candidates are:
seekend(::Base.Filesystem.File) at filesystem.jl:191
seekend(::IOStream) at iostream.jl:57
seekend(::Base.AbstractIOBuffer) at iobuffer.jl:178
...
Stacktrace:
[1] _find_enddiroffset(::ZipFile.ReadableFile) at /home/chuck/.julia/v0.6/ZipFile/src/ZipFile.jl:259
[2] ZipFile.Reader(::ZipFile.ReadableFile, ::Bool) at /home/chuck/.julia/v0.6/ZipFile/src/ZipFile.jl:104
[3] process_zip(::ZipFile.ReadableFile) at ./In[27]:7
[4] macro expansion at ./In[27]:18 [inlined]
[5] anonymous at ./<missing>:?
So it seems ZipFile package cannot process a zip file from a zip file as it cannot do a seekend on it.
Any ideas on how to do this?

A workaround is to read the zip file into an IOBuffer. ZipFile.Reader is able to process the IOBuffer. Here is the working code:
using ZipFile
using DataFrames
function process_zip(zip::ZipFile.ReadableFile)
if split(zip.name,".")[end] == "zip"
iobuffer = IOBuffer(readstring(zip))
r = ZipFile.Reader(iobuffer)
for f in r.files
process_zip(f)
end
end
if split(zip.name,".")[end] == "csv"
df = readtable(zip) #for now just read it into a dataframe
end
end
r = ZipFile.Reader("yourzipfilepathhere");
for f in r.files
process_zip(f)
end
close(r)

Matching filenames in loop

I want to loop through an array and match filenames to particular variables.
I am attempting to do so like this:
file.names = c("common", "08f13", "13f08")
for (f in file.names){
if grep("common", f) {
a=f
} else if grep("08f13", f){
b=f
} else
c=f
}
If common is in the filename, I want to assign it to the variable a; if 08 is in the filename, assign it to b; and so on. Based on the errors I am getting in R, I think there is something wrong with the structure of my loop, or I am using grep incorrectly.
My code returns this error:
Error: unexpected '}' in "}"

file.names = list.files(path, pattern=".prj")
for (f in file.names){
if(grepl("common", f)) {
a=f
} else if(grepl("08", f)) {
b=f
} else {
c=f
}
}
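Mistakes:
The conditions of the if and else if blocks need round brackets around them.
grep returns the integer indices of the matching elements (integer(0) when nothing matches), which is not a valid if condition; grepl returns TRUE / FALSE.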
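The difference between grep and grepl can be checked directly. The sketch below uses the three names from the question; the function name classify is hypothetical. It matches on the full "08f13" because the shorter pattern "08" would also match "13f08":

```r
f <- "13f08"
no_match <- grep("common", f)   # integer(0): if (no_match) would raise an error
is_match <- grepl("common", f)  # FALSE: safe to use as an if condition

# Hypothetical helper: return the name of the variable a file belongs to
classify <- function(f) {
  if (grepl("common", f)) {
    "a"
  } else if (grepl("08f13", f)) {
    "b"
  } else {
    "c"
  }
}

file.names <- c("common", "08f13", "13f08")
result <- sapply(file.names, classify)
print(result)
```

Returning a label from a function, rather than assigning a, b and c inside the loop, keeps the matching logic testable on its own.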

Using lapply to source multiple R scripts in sub-directories

These are the folders in my directory
128 128-1-32 16384 16384-1-36 4096-1 512 512-1-65 65536-1
128-1 128tbw1 16384-1 4096 4096-1-36 512-1 65536
Each of them has an a7.R script that loads files from its folder and creates images. I want my script to enter each of the folders, then run
source('a7.R')
then exit that folder and repeat the process for all the folders. I am doing this manually now and it is really boring. Is this possible with R?
I have tried a solution like this:
#!/usr/bin/Rscript
lapply(list.files(full.names=TRUE, recursive = TRUE, pattern = "^a7\\.R$"), source)
milenko#milenko-desktop:~/jbirp/mt07$ Rscript s.R
list()
coffeinejunky's solution is not working:
#!/usr/bin/Rscript
foo <- function(directory) { setwd(directory); source(a7.R) }
do.call("foo", list(directory= 128 128-1-32 16384 16384-1-36 4096-1 512 512-1-65 65536-1 128-1 128tbw1 16384-1 4096 4096-1-36 512-1 65536))
source('n.R')
Error in source("n.R") : n.R:2:33: unexpected numeric constant
1: foo <- function(directory) { setwd(directory); source(a7.R) }
2: do.call("foo", c(directory= 128 128
If I change the list like this:
do.call("foo", list(directory= "./128" "./128-1" "./128-1-32" "./128tbw1" "./16384" "./16384-1" "./16384-1-36" "./4096" "./4096-1" "./4096-1-36" "./512" "./512-1" "./512-1-65" "./65536" "./65536-1"))
I get:
Error in source("n.R") : n.R:2:40: unexpected string constant
1: foo <- function(directory) { setwd(directory); source(a7.R) }
2: do.call("foo", list(directory= "./128" "./128-1"
^
This is what I get when I list the path:
> list.dirs(path = ".", full.names = TRUE)
[1] "." "./128" "./128-1" "./128-1-32" "./128tbw1"
[6] "./16384" "./16384-1" "./16384-1-36" "./4096" "./4096-1"
[11] "./4096-1-36" "./512" "./512-1" "./512-1-65" "./65536"
[16] "./65536-1"
I need to change directory multiple times and perform the same operation in each of them. Is lapply good for this or not?

The following should work:
directories <- list.dirs(path = ".", full.names = TRUE)
# you need to make sure this contains only the relevant directories
# (those holding an a7.R); otherwise remove the irrelevant ones first
foo <- function(x) {
  old <- setwd(x) # setwd() returns the previous directory, so store it while changing into the new one
  on.exit(setwd(old)) # change back even if a7.R stops with an error
  source("a7.R")
}
lapply(directories, foo)
Alternatively,
for(folder in directories) foo(folder)

This will source every a7.R file with the working directory temporarily set to the sourced file's folder.
a7files <- list.files(full.names=TRUE, recursive = TRUE, pattern = "^a7\\.R$")
sapply(a7files, source, chdir = TRUE)
From ?source
chdir logical; if TRUE and file is a pathname, the R working directory is temporarily changed to the directory containing file for evaluating.
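To see chdir = TRUE in action without touching real data, here is a small self-contained sketch. The temporary tree, the two folder names and the one-line a7.R scripts are made up for the demo; each script just writes the name of its working directory to out.txt:

```r
# Build a throwaway tree with two folders, each holding its own a7.R
root <- tempfile("demo")
for (d in c("128", "512")) {
  dir.create(file.path(root, d), recursive = TRUE)
  writeLines('writeLines(basename(getwd()), "out.txt")',
             file.path(root, d, "a7.R"))
}

# Find every a7.R below root and source each one from its own folder
a7files <- list.files(root, full.names = TRUE, recursive = TRUE,
                      pattern = "^a7\\.R$")
start <- getwd()
invisible(lapply(a7files, source, chdir = TRUE))

# Each folder now has an out.txt naming that folder
print(readLines(file.path(root, "128", "out.txt")))
print(identical(getwd(), start))  # the working directory is restored afterwards
```

Passing an explicit root to list.files also avoids depending on where Rscript happens to be started.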

Debugging user-defined function

I am working with a dataset of "Baltimore homicides".
I need to write a function that takes a string, for example "shooting", and returns an integer giving the number of victims of "shooting".
I wrote the following function, but I receive these errors:
Error: unexpected '}' in " }"
Error: object 'counti' not found
I also can't figure out whether the ==NULL check is correct.
count <- function(cause = NULL) {
## Check that "cause" is non-NULL; else throw error
if cause==NULL
{
stop()
print("no cause provided")
}
## Read "homicides.txt" data file
homicides <- readLines("homicides.txt")
## Extract causes of death
i <- grep(cause, homicides) ##get indices of cause
counti <- lenghth(i) ##get count of indices
## Check that specific "cause" is allowed; else throw error
if counti=0
{
stop()
print("no such cause")
}
## Return integer containing count of homicides for that cause
return(counti)
}
This is my working function after editing, thanks guys:
count <- function(cause = NULL) {
if(missing(cause) | is.null(cause)) stop("no cause provided")
homicides <- readLines("homicides.txt")
i=length(grep(cause, homicides))
if(i==0) stop("no cause found")
return(i)
}

You can simplify your function to 2 lines by doing this:
count <- function(cause = NULL, data) {
if(is.null(cause)) stop("no cause provided")
length(grep(cause, data))
}
data <- c("murder", "some other cause")
count("murder", data)
[1] 1
Note the following principles:
R has many features of a functional language. This means that each function should, as far as possible, depend only on the arguments you pass it.
When you have a bug in your code, simplify it to the shortest possible version, fix the bug, then build out from there.
Also, keep stop() for really fatal errors. Not finding a search string in your data isn't an error; it simply means the cause wasn't found. You don't want your code to stop. At most, issue a message() or a warning().
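On the ==NULL part of the question: cause == NULL evaluates to logical(0), which if cannot use, so is.null(cause) is the right test. A minimal sketch combining that with the warning() advice above; the name count_cause, the ignore.case choice and the sample records are assumptions for illustration, not the real homicides data:

```r
# Hypothetical variant: stop only when no cause is given, warn when nothing matches
count_cause <- function(cause = NULL, data) {
  if (is.null(cause)) stop("no cause provided")
  n <- length(grep(cause, data, ignore.case = TRUE))
  if (n == 0) warning("no records matched cause: ", cause)
  n
}

# Made-up records standing in for lines of homicides.txt
records <- c("Cause: shooting", "Cause: Shooting", "Cause: stabbing")
print(count_cause("shooting", records))
print(suppressWarnings(count_cause("poison", records)))
```

A caller that does care about the zero case can still turn the warning into an error with tryCatch, while ordinary use keeps running.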
