.wav file length/duration without reading in the file - r

Is there a way to extract the information about .wav file length/duration without having to read in the file in R? I have thousands of those files and it would take a long time if I had to read in every single one to find its duration. Windows File Explorer gives you an option to turn on the Length field so you can see the file duration, but is there a way to extract that information so I can use it in R?
This is what I tried and would like to avoid doing since reading in tens of thousands of audio files in R will take a long time:
library(tuneR)
audio<-readWave("AudioFile.wav")
round(length(audio@left) / audio@samp.rate, 2)

You can use the readWave function from the tuneR package with header=TRUE. This reads only the file's header metadata, not the entire file.
library(tuneR)
audio<-readWave("AudioFile.wav", header=TRUE)
round(audio$samples / audio$sample.rate, 2)
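If tuneR is not available, the same idea can be sketched in base R by walking the RIFF chunks and dividing the size of the data chunk by the byte rate stored in the fmt chunk. This is only a minimal sketch: the function name wav_duration is made up for illustration, and it assumes an uncompressed (PCM) file smaller than 2 GB with the fmt chunk before the data chunk.

```r
# Hypothetical helper: duration in seconds taken from the RIFF header only.
# The audio samples themselves are never read.
wav_duration <- function(path) {
  con <- file(path, "rb")
  on.exit(close(con))

  riff <- readChar(con, 4, useBytes = TRUE)
  readBin(con, "integer", n = 1, size = 4, endian = "little")  # overall size, unused
  wave <- readChar(con, 4, useBytes = TRUE)
  if (!identical(riff, "RIFF") || !identical(wave, "WAVE")) return(NA_real_)

  byte_rate <- NA_real_
  repeat {
    id <- readChar(con, 4, useBytes = TRUE)
    if (length(id) == 0 || nchar(id) < 4) return(NA_real_)  # ran out of chunks
    size <- readBin(con, "integer", n = 1, size = 4, endian = "little")
    if (length(size) == 0) return(NA_real_)

    if (id == "fmt ") {
      fmt <- readBin(con, "raw", n = size)
      # Bytes 9-12 of the fmt chunk hold the average bytes per second.
      byte_rate <- readBin(fmt[9:12], "integer", size = 4, endian = "little")
    } else if (id == "data") {
      return(size / byte_rate)  # bytes of audio divided by bytes per second
    } else {
      # Skip other chunks (e.g. LIST); chunks are padded to an even length.
      readBin(con, "raw", n = size + size %% 2)
    }
  }
}
```

For a folder of files, something like vapply(list.files("audio", pattern = "\\.wav$", full.names = TRUE), wav_duration, numeric(1)) would return one duration per file, where "audio" is a placeholder directory name.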

Related

Is there a way to compare the structure/architecture of .nc files in R?

I have a sample .nc file that contains a number of variables (5 to be precise) and is being read into a program. I want to create a new .nc file containing different data (and different dimensions) that will also be read into that program.
I have created a .nc file that looks the same as my sample file (I have included all of the necessary attributes for each of the variables that were included in the original file).
However, my file is still not being ingested.
My question is: is there a way to test for differences in the layout/structure of .nc files?
I have examined each of the variables/attributes within Rstudio and I have also opened them in panoply and they look the same. There are obviously differences (besides the actual data that they contain) since the file is not being read.
I see that there are options to compare the actual data within .nc files online (Comparison of two netCDF files), but that is not what I want. I want to compare the variable/attributes names/states/descriptions/dimensions to see where my file differs. Is that possible?
The ideal situation here would be to create a .nc template from the variables that exist within the original file and then fill in my data. I could do this by defining the dimensions (ncdim_def), creating the file (nc_create), getting my data (ncvar_get) and putting it in the file (ncvar_put), but that is what I have done so far, and it is too reliant on me not making an error (which I obviously have, as the files are not the same).
If you are on unix this is more easily achieved using CDO. See the Information section of the reference card: https://code.mpimet.mpg.de/projects/cdo/embedded/cdo_refcard.pdf.
For example, if you wanted to check that the descriptions are the same in files just do:
cdo griddes example1.nc
cdo griddes example2.nc
You can easily wrap this in R with system() or system2().
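A rough sketch of such a wrapper follows. It assumes the cdo binary is on the PATH; the helper names cdo_lines and compare_output are invented here, and the file names are the ones from the example above. The comparison is a plain line-by-line set difference of the printed output, which is a simplification.

```r
# Hypothetical helper: run one CDO operator on one file and capture its output lines.
cdo_lines <- function(operator, file) {
  system2("cdo", c(operator, file), stdout = TRUE, stderr = FALSE)
}

# Hypothetical helper: lines that appear in one output but not in the other.
compare_output <- function(a, b) {
  list(only_in_first = setdiff(a, b), only_in_second = setdiff(b, a))
}

# Only attempt the comparison when cdo and both files are actually present.
if (nzchar(Sys.which("cdo")) && all(file.exists(c("example1.nc", "example2.nc")))) {
  print(compare_output(cdo_lines("griddes", "example1.nc"),
                       cdo_lines("griddes", "example2.nc")))
}
```

Other operators from the Information section of the reference card can be passed as the first argument in the same way.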

Importing to R an Excel file saved as web-page

I would like to open an Excel file saved as webpage using R and I keep getting error messages.
The desired steps are:
1) Upload the file into RStudio
2) Change the format into a data frame / tibble
3) Save the file as an xls
The message I get when I open the file in Excel is that the file format (excel webpage format) and extension format (xls) differ. I have tried the steps in this answer, but to no avail. I would be grateful for any help!
I don't expect anybody will be able to give you a definitive answer without a link to the actual file. The complication is that many services will write files as .xls or .xlsx without them being valid Excel format. This is done because Excel is so common and some non-technical people feel more confident working with Excel files than a csv file. Now, the files will have been stored in a format that Excel can deal with (hence your warning message), but R's libraries are more strict and don't see the actual file type they were expecting, so they fail.
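One way to check what a file really contains, regardless of its extension, is to look at its first few bytes. The sketch below uses the helper name sniff_format, which is invented for illustration. It relies on three well-known signatures: a zip container (.xlsx) starts with "PK", the old binary .xls format starts with the OLE2 magic number, and HTML or XML starts with "<". Anything else is reported as unknown, so treat the result as a hint rather than a guarantee.

```r
# Hypothetical helper: guess the real format of a file from its leading bytes.
sniff_format <- function(path) {
  bytes <- readBin(path, "raw", n = 64)

  zip_magic  <- as.raw(c(0x50, 0x4B, 0x03, 0x04))                          # "PK\003\004"
  ole2_magic <- as.raw(c(0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1))  # legacy .xls

  if (identical(bytes[1:4], zip_magic))  return("xlsx (zip container)")
  if (identical(bytes[1:8], ole2_magic)) return("xls (OLE2 binary)")

  # Ignore whitespace and UTF-8 byte-order-mark bytes, then look for "<".
  skip <- as.raw(c(0x09, 0x0A, 0x0D, 0x20, 0xEF, 0xBB, 0xBF))
  rest <- bytes[!bytes %in% skip]
  if (length(rest) > 0 && rest[1] == as.raw(0x3C)) return("html or xml text")

  "unknown"
}
```

If this reports HTML for a file named .xls, an HTML table reader is likely the right tool rather than an Excel reader.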
That said, the below steps worked for me when I last encountered this problem. A service was outputting .xls files which were actually just HTML tables saved with an .xls file extension.
1) Download the file to work with it locally. You can script this of course, e.g. with download.file(), but this step helps eliminate other errors involved in working directly with a webpage or connection.
2) Load the full file with readHTMLTable() from the XML package
library(XML)
dTemp = readHTMLTable([filename], stringsAsFactors = FALSE)
This will return a list of dataframes. Your result set will quite likely be the second element or later (see ?readHTMLTable for an example with explanation). You will probably need to experiment here and explore the list structure as it may have nested lists.
3) Extract the relevant list element, e.g.
df = dTemp[[2]]  # double brackets return the data frame itself, not a one-element list
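When it is not obvious which list element holds the data, it can help to count the rows of each element first. The sketch below is base R only and works on any list of data frames; the helper name largest_table is made up, and picking the table with the most rows is just an assumption that often, but not always, matches the table you want.

```r
# Hypothetical helper: return the list element with the most rows.
# Elements that are not data frames (e.g. NULL entries) count as zero rows.
largest_table <- function(tables) {
  rows <- vapply(
    tables,
    function(x) if (is.data.frame(x)) nrow(x) else 0L,
    integer(1)
  )
  tables[[which.max(rows)]]
}

# Stand-in for the list returned by readHTMLTable().
example_tables <- list(NULL, data.frame(a = 1), data.frame(x = 1:3, y = 4:6))
df <- largest_table(example_tables)
str(df)
```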
You also mention writing out the final data frame as an xls file which suggests you want the old-style format. I would suggest the package WriteXLS for this purpose.
I seriously doubt the Excel file is 'saved as a web page'. I'm pretty sure the file just sits on a server and all you have to do is fetch it. Some kinds of files (in particular Excel and h5) are binary rather than text files, so download.file() needs mode="wb" to tell R to handle the file as binary.
myurl <- "http://127.0.0.1/imaginary/file.xlsx"
download.file(url=myurl, destfile="localcopy.xlsx", mode="wb")
Or use the downloader package and try something like this:
library(downloader)
myurl <- "http://127.0.0.1/imaginary/file.xlsx"
download(myurl, destfile="localcopy.xlsx", mode="wb")

Error: Invalid: File is too small to be a well-formed file - error when using feather in R

I'm trying to use feather (v. 0.0.1) in R to read a fairly large (3.5 GB) csv file with 21178665 rows and 16 columns.
I use the following lines to load the file:
library(feather)
path <- "pp-complete.csv"
df <- read_feather(path)
But I get the following error:
Error: Invalid: File is too small to be a well-formed file
There's no explanation in the documentation of read_feather so I'm not sure what's the problem. I guess this function expects a different file form but I'm not sure what that would be.
Btw, I can read the file with read_csv in readr library but it takes a while.
The feather file format is distinct from a CSV file format. They are not interchangeable. The read_feather function cannot read simple CSV files.
If you want to read CSV files quickly, your best bets are probably readr::read_csv or data.table::fread. For large files, it will still usually take a while just to read it from disc.
After you've loaded the data into R, you can create a file in the feather format with write_feather so you can read it with read_feather the next time.
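The convert-once, read-fast-afterwards workflow might look like the sketch below. It uses a tiny temporary CSV as a stand-in for the real file and base read.csv so that it runs without extra packages; for a multi-gigabyte file you would swap in readr::read_csv or data.table::fread. The feather step only runs if the feather package is installed.

```r
# Stand-in CSV; in practice this would be the large file, e.g. "pp-complete.csv".
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:3, price = c(10.5, 20, 30.25)),
          csv_path, row.names = FALSE)

# Step 1: parse the CSV once (the slow step for a large file).
df <- read.csv(csv_path)

# Step 2: cache it in feather format, then read the cache on later runs.
feather_path <- tempfile(fileext = ".feather")
if (requireNamespace("feather", quietly = TRUE)) {
  feather::write_feather(df, feather_path)
  df_cached <- feather::read_feather(feather_path)
  print(dim(df_cached))
}
```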

Read a .cdb file into R very slow

I have .cdb files in binary format in sizes of ~630M.
I use read.cdb(file, type='cdb') from the cdb library to read them in, but it takes forever to load (20+ minutes).
Is this normal for a large file like this?

R crashes when trying to save list object ~1gb in size

I have a list object that includes many .wav files. Collectively, the .wav files are ~1gb in size. I've been trying to save the list object in a .Rdata file, like this:
save(my_list_of_wavs, file = 'wavs.Rdata')
After running this function, R does not respond and I need to force R to quit. Is there any other function or workaround I can use to save the .Rdata file?
I agree with @IShouldBuyABoat, you might need to wait longer.
To make saving faster you can turn off compression:
save(my_list_of_wavs, file = 'wavs.Rdata', compress=FALSE)
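Before committing to the full 1 GB object, it may be worth timing both settings on a small stand-in and confirming that the uncompressed file loads back correctly. The sketch below uses random numbers in place of real audio, so the timings are only indicative.

```r
# Stand-in for the real list of Wave objects.
my_list_of_wavs <- list(a = rnorm(1e5), b = rnorm(1e5))

path_plain <- tempfile(fileext = ".Rdata")
path_gz    <- tempfile(fileext = ".Rdata")

# Compare the time taken with and without compression.
print(system.time(save(my_list_of_wavs, file = path_plain, compress = FALSE)))
print(system.time(save(my_list_of_wavs, file = path_gz, compress = TRUE)))

# Load the uncompressed file into a separate environment and check the round trip.
restored <- new.env()
load(path_plain, envir = restored)
round_trip_ok <- identical(restored$my_list_of_wavs, my_list_of_wavs)
print(round_trip_ok)
```

The uncompressed file will be larger on disk, which is the trade-off for the faster write.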
