I have a script that I am using to analyse some audio data in RStudio. The script runs with no errors, but when I execute it, it simply does nothing. I think it may be an issue with the system2 command. Any help is much appreciated; I am at a loss without even an error message to go by.
Here is my code:
# Set the directory containing the files
directory <- "D://Alto//Audio 1//Day 2//"
# The directory to store the results
base_output_directory <- "D://Ecoacoustics//Output//indices_output//"
# Get a list of audio files inside the directory
# (Get-ChildItem is just like ls, or dir)
files <- list.files(directory, pattern = "*.wav", full.names = TRUE)
# iterate through each file
for (file in files) {
  message("Processing ", file)
  # get just the name of the file
  file_name <- basename(file)
  # make a folder for results
  output_directory <- normalizePath(file.path(base_output_directory, file_name))
  dir.create(output_directory, recursive = TRUE)
  # prepare command
  command <- sprintf('audio2csv "%s" "Towsey.Acoustic.yml" "%s" ', file, output_directory)
  # finally, execute the command
  system2('C://Ecoacoustics//AnalysisPrograms//AnalysisPrograms.exe//', command)
}
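For reference, a minimal (untested) sketch of how the final system2 call is conventionally written, assuming the .exe path itself is correct: the first argument is the bare executable path with no trailing slashes, the arguments go in a character vector, and capturing stdout/stderr makes failures visible instead of silent:
exe <- "C:/Ecoacoustics/AnalysisPrograms/AnalysisPrograms.exe"
cmd_args <- c("audio2csv", shQuote(file), shQuote("Towsey.Acoustic.yml"), shQuote(output_directory))
res <- system2(exe, cmd_args, stdout = TRUE, stderr = TRUE)  # capture the tool's output for debugging
print(res)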
I am trying to read all the files in a specific sub-folder of the working directory. I have been able to add a for loop successfully, but the loop only looks at files within the working directory itself. I thought the line:
directory <- 'folder.I.want.to.look.in'
would enable this, but the script still only looks in the working directory. However, the line above does help create a list of the correct files. I have included my script below, but I'm not sure what I need to modify to aim it at a specific sub-folder.
library(readxl)

directory <- 'folder.I.want.to.look.in'
files <- list.files(path = directory)
out_file <- read_excel("file.to.be.used.in.output", col_names = TRUE)
for (filename in files){
  show(filename)
  filepath <- paste0(filename)
  ## Import data
  data <- read_excel(filepath, skip = 8, col_names = TRUE)
  data <- data[, -c(6:8)]
  # ... further script ...
}
The further script is irrelevant to this question and works fine. I just can't get the loop to look over each file in files from directory. Many thanks in advance
Set your base directory, and then use it to create a vector of all the files with list.files, e.g.:
base_dir <- 'path/to/my/working/directory'
all_files <- file.path(base_dir, list.files(base_dir, recursive = TRUE))
Then just loop over all_files. By default, list.files has recursive = FALSE, i.e., it will only get the file and directory names at the top level of the directory you specify, rather than going into each subfolder. Setting recursive = TRUE returns each filepath relative to your base directory, which is why we join it back onto base_dir (file.path is safer than paste0 here, since it inserts the missing separator).
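Alternatively, list.files can build the full paths itself; a one-line sketch of the same thing:
all_files <- list.files(base_dir, recursive = TRUE, full.names = TRUE)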
I'm using R, via Rscript and H2O, but H2O is crashing. I want to review the logs, but the R tempdir that contains them seems to be removed when the R session ends (i.e. when the Rscript finishes).
Is it possible to tell R/Rscript not to remove the tmp folder it uses?
A workaround for this would be to use on.exit to collect the temporary files and save them in a different directory. An example function would look like this:
ranfunction <- function(){
  # Get the list of files in tempdir() when the function exits
  on.exit(templist <- list.files(tempdir(), full.names = TRUE, pattern = "^file"))
  # Create a new directory for the files to go to on exit
  # (use add = TRUE to append to the existing on.exit expressions)
  on.exit(dir.create(dir1 <- file.path("G:", "testdir")), add = TRUE)
  # Copy each file in templist into the new directory
  # (the exit expressions run in order, so templist and dir1
  # already exist by this point; file.copy preserves the contents)
  on.exit(file.copy(templist, dir1), add = TRUE)
}
ranfunction()
One thing this function does not take into account is that if you rerun it, it will throw an error because the new directory dir1 already exists. You would have to delete dir1 before re-running the script.
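If you do need to rerun it, a small sketch (assuming the same hard-coded path as above) is to remove the directory first:
unlink(file.path("G:", "testdir"), recursive = TRUE)  # delete dir1 and its contents
ranfunction()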
I have a simple function in R that runs summary() via lapply() on many CSVs from one directory that I specify. The function is shown below:
# id -- the file name (e.g. 001.csv, so id == "001")
# dir -- location of the CSV files (not my working directory)
# summarize -- logical; whether a summary of the CSV should be output to the console
getMonitor <- function(id, dir, summarize = FALSE) {
  fl <- list.files(dir, pattern = "*.csv", full.names = FALSE)
  fdl <- lapply(fl, read.csv)
  dataSummary <- lapply(fdl, summary)
  if (summarize == TRUE) {
    dataSummary[[id]]
  }
}
When I try to specify the directory and then pass it as a parameter to the function like so:
dir <- "C:\\Users\\ST\\My Documents\\R\\specdata"
funcVar <- getMonitor("001", dir, FALSE)
I receive the error:
Error in file(file, "rt") : cannot open the connection. In addition: Warning message:
In file(file, "rt") : cannot open file '001.csv': No such file or directory
Yet when I run the code below on its own:
fl <- list.files("C:\\Users\\ST\\My Documents\\R\\specdata",
pattern = "*.csv",
full.names = FALSE)
fl[1]
It finds the directory I'm pointing to, and fl[1] correctly outputs [1] "001.csv", which is the first file listed.
My question is: what am I doing wrong when trying to pass this path variable as a parameter to my function? Is R incapable of handling a parameter this way? Is there something I'm just completely missing? I've tried searching around and am familiar with other programming languages, so, frankly, I feel kind of stupid/defeated for getting stuck on this right now.
You're passing fl[1] directly to read.csv without the qualifying path. If, instead, you use full.names = TRUE, you'll get the full path and your read.csv step will work properly. However, you'll have to do a little munging to make your if statement work again.
You could also expand on your lapply function to paste the directory and file name together:
fdl <- lapply(fl, function(x) read.csv(paste(dir, x, sep='\\')))
Or create this pasted full path in a separate line:
fl.qualified <- paste(dir, fl, sep='\\')
fdl <- lapply(fl.qualified, read.csv)
When you do the paste step, if you want to be really explicit, I would encourage a regex to make sure you don't have someone passing a directory with a trailing slash:
fl.qualified <- paste(gsub('\\\\$', '', dir), fl, sep='\\')
or something along those lines.
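As an aside, file.path() sidesteps the separator and escaping issues entirely (R on Windows accepts forward slashes), so a sketch of the same step could be:
fl.qualified <- file.path(dir, fl)
fdl <- lapply(fl.qualified, read.csv)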
I'm trying to do something I think should be straightforward enough, but so far I've been unable to figure it out (not surprisingly, I'm a noob)...
I would like to be able to prompt a user for input file(s) in R. I've successfully used file.choose() to get a single file, but I would like to have the option of selecting more than one file at a time.
I'm trying to write a program that sucks in daily data files with the same header and appends them into one large monthly file. I can do it in the console by importing the files individually and then using rbind(file1, file2, ...), but I need a script to automate the process. The number of files to append will not necessarily be constant between runs.
Thanks
Update: Here is the code I came up with that works for me; maybe it will be helpful to someone else as well.
library(tcltk)

File.names <- tk_choose.files()  # Prompts user for files to be combined
Num.Files <- NROW(File.names)    # Gets number of files selected by user

# Create one large file by combining all files
# (assumes at least two files were selected)
Combined.file <- read.delim(File.names[1], header = TRUE, skip = 2)  # read in the first file of the selection
for (i in 2:Num.Files) {
  temp <- read.delim(File.names[i], header = TRUE, skip = 2)  # read in the next file
  Combined.file <- rbind(Combined.file, temp)                 # append it to the combined file
}

output.dir <- dirname(File.names[1])  # Finds the directory of the files that were selected
setwd(output.dir)                     # Changes directory so the output file lands next to the input files
output <- readline(prompt = "Output Filename: ")  # Prompts user for the output file name
outfile.name <- paste(output, ".txt", sep = "")
write.table(Combined.file, file = outfile.name, sep = "\t",
            col.names = TRUE, row.names = FALSE)  # write a tab-delimited text file in the same dir as the input files
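For what it's worth, the read-and-append step can also be written without the explicit loop, which avoids growing Combined.file one rbind at a time; a sketch reusing the same read.delim arguments:
file.list <- lapply(File.names, read.delim, header = TRUE, skip = 2)  # read every selected file
Combined.file <- do.call(rbind, file.list)                            # bind them all in one go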
Have you tried ?choose.files
Use a Windows file dialog to choose a list of zero or more files interactively.
If you are willing to type each file name, why not just loop over all the files like this:
filenames <- c("file1", "file2", "file3")
filecontents <- lapply(filenames, function(fname) {<insert code for reading file here>})
bigfile <- do.call(rbind, filecontents)
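For example, if the files were CSVs with a common header (an assumption here), the reading step could simply be:
filecontents <- lapply(filenames, read.csv)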
If your code must be interactive, you can use the readline function in a loop that will stop asking for more files when the user inputs an empty line:
getFilenames <- function() {
filenames <- list()
x <- readline("Filename: ")
while (x != "") {
filenames <- append(filenames, x)
x <- readline("Filename: ")
}
filenames
}
I have an R script to load multiple text files in a directory and save the data as compressed .rda. It looks like this:
#!/usr/bin/Rscript --vanilla
args <- commandArgs(TRUE)
## args[1] is the folder name
outname <- paste(args[1], ".rda", sep = "")
files <- list.files(path = args[1], pattern = "\\.txt$", full.names = TRUE)
tmp <- list()

if (file.exists(outname)) {
  message("found ", outname)
  load(outname)
  tmp <- get(args[1])                  # previously read stuff
  files <- setdiff(files, names(tmp))
}

if (length(files) == 0) {  # list.files() returns character(0), not NULL, when nothing matches
  message("no new files")
} else {
  ## read the files into a list of matrices
  results <- plyr::llply(files, read.table, .progress = "text")
  names(results) <- files
  assign(args[1], c(tmp, results))
  message("now saving... ", args[1])
  save(list = args[1], file = outname)
}
message("all done!")
The files are quite large (15 MB each, typically 50 of them), so running this script takes up to a few minutes, a substantial part of which is spent writing the .rda results.
I often update the directory with new data files, so I would like to append them to the previously saved and compressed data. This is what I do above by checking whether there's already an output file with that name. The last step, saving the .rda file, is still pretty slow.
Is there a smarter way to go about this in some package, keeping a trace of which files have been read, and saving this faster?
I saw that knitr uses tools:::makeLazyLoadDB to save its cached computations, but this function is not documented so I'm not sure where it makes sense to use it.
For intermediate files that I need to read (or write) often, I use
save(..., compress = FALSE)
which speeds up things considerably.
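A quick way to see the difference on your own data (a rough sketch; timings and sizes will vary):
x <- matrix(rnorm(1e6), ncol = 10)  # stand-in for one of the parsed tables
system.time(save(x, file = "x_default.rda"))                 # compressed (the default)
system.time(save(x, file = "x_fast.rda", compress = FALSE))  # typically much faster to write
file.size("x_default.rda"); file.size("x_fast.rda")          # at the cost of a larger file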