Looping through nc files in R - r

Good morning everyone,
I am currently using the code written by Antonio Olinto Avila-da-Silva on this link: https://oceancolor.gsfc.nasa.gov/forum/oceancolor/topic_show.pl?tid=5954
It allows me to extract data of type sst/chlor_a from nc file. It uses a loop to create an excel file with all the data. Unfortunately, I noticed that the function only takes the first data file in the loop. Thus, I find myself with 20 times the same data in a row in my excel file.
Does anyone have a solution to make this loop work properly?

I would first check out that these two lines contain all the files you are expecting:
(f <- list.files(".", pattern="*.L3m_MO_SST_sst_9km.nc",full.names=F))
(lf<-length(f))
And then there's a bug in the for-loop. This line:
data<-nc_open(f)
Needs to reference the iterator i, so change it to something like this:
data<-nc_open(f[[i]])
It appears both scripts have this same bug.

Related

Pulling out specific files in a directory based on unique file names and reading them in with read_wav in R

I have >1000 audio files in one directory that I have been tagging for bird calls and entering the data into excel.
Once I read in this data, I filtered out covariates of interest and made separate data frames. Basically, I have about 45 files I want to analyse separately with read_wav.
I'm not sure how to create a for loop to look into a certain directory '/all_SMP' and pull out these 45 files from a list of >1000.
I've created a list "list.clean", however my current forloop only lists every file in that folder (>1000 files)
for(i in c(list.clean)){
raw.path <- paste0("../02_Working/SMP_SM4/SMP_15sec/all_AR_SMP")
wav.list <- list.files(path="../02_Working/SMP_SM4/SMP_15sec/all_AR_SMP",
pattern="*.wav",
recursive=TRUE)
I'm quite a novice with R as I'm sure you can tell.
I want to read_wav on the 45 files and use the 'analyze' function on each file
audio1 <- analyze(sample1,samplingRate=24000, cutFreq=c(800,8000))
Hope this makes sense!
Cheers
In list.files the parameter pattern is a regular expression.
So, for your case should be pattern=".*\\.wav$" (and it's case sensitive).
Alternatively and easier, you can use fs::dir_ls(glob = "*.wav").

Remove everything after an empty row on a list in R

I have a quick question that I cannot figure out. I am reading some results from an output file using the code below and stored as a list in R that can be seen in the picture. I want to delete all of the information after an empty row, in other words, it would be everything after line 42:
Does anybody know anything that I could use? I tried using gsub was I was not very successful.
Thanks for all of the help I am new to programming in R. Again any help is very much appreciated.
LoadFFA <- function(filename, folder.out, TYPE = "PeakFQ_17C",
colStandard = TRUE){ # standardize column output names
require(data.table)
if(grepl("PEAKFQSA",TYPE)){ # PeakfqSA Bulleting 17C analysis
text.list<-lapply(fileinput,readLines)
skip.rows<-sapply(text.list, grep, pattern = '^Ann. Exc. Prob.\\s+EMA Est.')-1
PFA <- lapply(seq_along(text.list),function(i) read.delim(fileinput[i],skip=skip.rows[i],sep="\n",stringsAsFactors = TRUE,blank.lines.skip = FALSE))
}
EDIT
I don't know if I could upload directly so here is the google drive link.
Also, here is the command to run the function LoadFFA("03606500peaks.out","D:/Documents/hydraulic.failures","PEAKFQSA"). The screenshot is the result using print(PFA).
The reason why I am using a loop is because I am reading multiple files (output files) and they have a lot of data, multiple lenghts, and I am reading the data beginning Ann.Exc.Prob. and as per the screenshot provided I would like to end after line 42 (after a full empty row). I hope that clears some confusion.
Basically read the output files, start reading on "Ann.Exc.Prob" and end until the end of that data (line 42 for this particular file). I am using a function because I am running several times.
Again, sorry for the trouble. Thank you for your time and I appreciate your patience.
https://drive.google.com/file/d/1PGbGWIHFj7IQRevTAEfqqA9Okg4fz7Mg/view?usp=sharing

How to import multiple matlab files into R (Using package R.Matlab)

Thank you in advance for your're help. I am using R to analyse some data that is initially created in Matlab. I am using the package "R.Matlab" and it is fantastic for 1 file, but I am struggling to import multiple files.
The working script for a single file is as follows...
install.packages("R.matlab")
library(R.matlab)
x<-("folder_of_files")
path <- system.file("/home/ashley/Desktop/Save/2D Stream", package="R.matlab")
pathname <- file.path(x, "Test0000.mat")
data1 <- readMat(pathname)
And this works fantastic. The format of my files is 'Name_0000.mat' where between files the name is a constant and the 4 digits increase, but not necesserally by 1.
My attempt to load multiple files at once was along these lines...
for (i in 1:length(temp))
data1<-list()
{data1[[i]] <- readMat((get(paste(temp[i]))))}
And also in multiple other ways that included and excluded path and pathname from the loop, all of which give me the same error:
Error in get(paste(temp[i])) :
object 'Test0825.mat' not found
Where 0825 is my final file name. If you change the length of the loop it is always just the name of the final one.
I think the issue is that when it pastes the name it looks for that object, which as of yet does not exist so I need to have the pasted text in speach marks, yet I dont know how to do that.
Sorry this was such a long post....Many thanks

Writing results from multiple runs of an R script to a single output csv file

I have written an R script that is to be used as part of a shell script based pipeline which will feed dozens of files containing genetic sequence data to the R script one after the other (using args[]).
I am having trouble finding a way to write the results of each run of this script to a single results file. I thought that the easiest way to do this might be to create an empty results.csv table and then ask the script to write to the next row of this file each time it is run (saves the problem of the script writing straight over the file on each run). In this vein a friend helped me out with the following code:
x<-readLines("results.csv")
if(x[[1]]==""){x[[1]]<-paste("meancoscore", "meanboot", "CIres", "RIres", "RC", "nodecount", sep= ",")}
x[[length(x)+1]]<-paste(meancoscore, meanboot, CIres, RIres, RC, nodecount, sep = ",")
x<-data.frame(x)
write.table(x,"results.csv", row.names = F, col.names = F, sep = ",")
In the above code "meancoscore", "meanboot", "CIres", "RIres", "RC", and "nodecount" are first used as a header if the data frame has nothing on the first row.
Following this the results (objects: meancoscore, meanboot, CIres, RIres, RC and nodecount are written in the columns corresponding with their headers. The idea here is that if you run the R script again with different source files it should simply write the results to the next line in the results.csv file.
However, the following is seen in the results.csv file after three runs of this code with different input files:
"\""\\""meancoscore,meanboot,CIres,RIres,RC,nodecount\\""\""
""\""\\""0.000,76.3247863247863,0.721002252252252,0.983235214508053,0.708914804154032,117\\""\""
""\""0.845,77.6923076923077,0.723259762308998,0.983410513459875,0.711261254217159,117\""
""0.85,77.4358974358974,0.728886344116805,0.983878381369061,0.717135516451654,117"
Where my desired result would be the following:
meancoscore,meanboot,CIres,RIres,RC,nodecount
0.000,76.3247863247863,0.721002252252252,0.983235214508053,0.708914804154032,117
0.845,77.6923076923077,0.723259762308998,0.983410513459875,0.711261254217159,117
0.85,77.4358974358974,0.728886344116805,0.983878381369061,0.717135516451654,117
It is worth noting that each successive fun seems to be adding more backslashes and more quotation marks to the results.csv file.
Ideally I would like to be able to simply read in the results.csv file when it is done and analyse the data by accessing the columns with results$meanboot, or summary(results$meanboot) for example.
Could anyone offer some advice on how to modify the above code or offer an alternative solution?
I should add here that I purposefully did not go for the option of writing into the R script a loop that will run through the input files of interest and simply assemble a full table of results as an object (I am aware that this would be very simple to write out). This was because the work being done by this script will be farmed out to multiple machines in a cluster.
Thank you for your time and any help you might be able to offer.
The problem was solved by adding quote = FALSE to the write.table() call as per voidHead's suspicion.

Read SPecific lines of a CSV file in R-language

I am trying to write a code which manipulates data from a particular .csv and writes the data to another one.
I want to read each line one by one and perform the operation.
Also I am trying to read a particular line from the .csv but what I am getting is that line and the lines before it.
I am a beginner in R-Language, so I find the syntax a bit confusing.
testconn<=file("<path>")
num<-(length(readLines(testconn)))
for(i in 1:num){
num1=i-1
los<=read.table(file="<path>",sep=",",head=FALSE,skip=num1,nrows=1)[,c(col1,col2)]
write.table(los,"<path>",row.names=FALSE,quote=FALSE,sep=",",col.names=FALSE,append=TRUE)
}
This is the code I am currently using, thought it is giving the desiored output but it is extremely slow, my .csv data file has 43200 lines.
Your code doesn't work. You confuse the comparison operator <= and the assignment one <-
Your code is is extremly innefficient. You call both read.table and write.table 43200 times to read/write a single file.
You can simply do this:
los<- read.table(file="<path>",sep=",")[,c(col1,col2)]
res <- apply(los,1,function(x){## you treat your line here}
write.table(res,"<path_write>",row.names=FALSE,
quote=FALSE,sep=",",col.names=FALSE)

Resources