Bulk downsampling list of wave objects using seewave::resamp - r

I am trying to downsample a list of wave objects using seewave::resamp.
To get my list I have imported a .wav file and split it into 10 second clips following #Jota 's answer here
So to get my list of wave object I have done the following (this is using the example from the above answer):
library(seewave)
# your audio file (using example file from seewave package)
data(tico)
audio <- tico # this is an S4 class object
# the frequency of your audio file
freq <- 22050
# the length and duration of your audio file
totlen <- length(audio)
totsec <- totlen/freq
# the duration that you want to chop the file into
seglen <- 0.5
# defining the break points
breaks <- unique(c(seq(0, totsec, seglen), totsec))
index <- 1:(length(breaks)-1)
# a list of all the segments
subsamps <- lapply(index, function(i) audio[(breaks[i]*freq):(breaks[i+1]*freq)])
I now have my list of wave objects. If do the following for individual objects it works:
resamp(subsamps[[1]], f = 48000, g = 22050, output = "Wave")
But when I try and do it to the list of objects it comes up with an error:
test_wave_downsample <- lapply(subsamps, function(i) resamp(subsamps[[i]], f = 22050, g = 8000, output = "Wave"))
Error in subsamps[[i]] : invalid subscript type 'S4'
I am pretty sure this is something to do with the way I using lapply as the S4 object is not an issue when done individually, but as someone who is new to using the apply family I am not sure what.
I have had a look around an can't find much on using existing functions within lapply or if that can be an issue.
Any advice greatly appreciated.

After asking around I have been given the a solution that works:
f <- function(x) {
resamp(x, f = 22050, g = 8000, output = "Wave")
}
​
test_wave_downsample <- lapply(subsamps, f)

Related

How to efficiently parallelize EXTRACT function in raster package R

Given a netcdf file, I am trying to extract all pixels to form a data.frame for later export to .csv
a=brick(mew.nc)
#get coordinates
coord<-xyFromCell(a,1:ncell(a))
I can extract data for all pixels using extract(a,1:ncell(a)). However, I run into memory issues.
Upon reading through various help pages, I found that one can speed up things with:
beginCluster(n=30)
b=extract(a, coord)
endCluster()
But I still run out of memory. Our supercomputer has more than 1000 nodes, each node has 32 cores.
My actual rasterbrick has 400,000 layers
I am not sure how to parrallize this task without running into memory issues.
Thank you for all your suggestions.
Sample data of ~8MB can be found here
You can do something along these lines to avoid memory problems
library(raster)
b <- brick(system.file("external/rlogo.grd", package="raster"))
outfile <- 'out.csv'
if (file.exists(outfile)) file.remove(outfile)
tr <- blockSize(b)
b <- readStart(b)
for (i in 1:tr$n) {
v <- getValues(b, row=tr$row[i], nrows=tr$nrows[i])
write.table(v, outfile, sep = ",", row.names = FALSE, append = TRUE, col.names=!file.exists(outfile))
}
b <- readStop(b)
To parallelize, you could do this by layer, or groups of layers; and probably all values in one step for each subset of layers. Here for one layer at a time:
f <- function(d) {
filename <- extension(paste(names(d), collapse='-'), '.csv')
x <- values(d)
x <- matrix(x) # these two lines only needed when using
colnames(x) <- names(d) # a single layer
write.csv(x, filename, row.names=FALSE)
}
# parallelize this:
for (i in 1:nlayers(b)) {
f(b[[i]])
}
or
x <- sapply(1:nlayers(b), function(i) f(b[[i]]))
You should not be using extract. The question I have is what you would want such a large csv file for.

r raster cover function - error with landsat stacks

I'm running into an unusual issue with the cover function in R. We are trying to fill cloudy pixels with values from another layer. I can make it work just fine with stacks as follows -
library(raster)
r1 <- raster(ncols=36, nrows=18)
r1[] <- 1:ncell(r1)
r1b <- r1a <- r1
r1_stack <- stack(r1, r1a, r1b)
r2 <- setValues(r1, runif(ncell(r1)))
r2b <- r2a <- r2
r_stack <- stack(r2, r2a, r2b)
r_stack[r_stack < 0.5] <- NA
r3 <- cover(r_stack, r1_stack)
But then i try to do the same thing with a raster stack and i get the error:
Error in as.character(x) :
cannot coerce type 'closure' to vector of type 'character'
The code:
# get all tifs
LS5_032_032_2008_09_21 <- list.files("LT050340302008090301T1-SC20170526100900/",
pattern = glob2rx("*band*.tif$"), full.names = T)
# stack bands
cloudy_scene <- stack(LS5_032_032_2008_09_21)
# import cloud mask
cloud_mask <- raster('LT050340302008090301T1-SC20170526100900/LT05_L1TP_034030_20080903_20160905_01_T1_sr_cloud_qa.tif')
# mask data
masked_data <- mask(cloudy_scene, mask = cloud_mask, maskvalue=0, inverse=TRUE)
####### get cloud free data
# get files
LS5_2008_09_19 <- list.files("LT050340302008091901T1-SC20170526101124/",
pattern = glob2rx("*band*.tif$"), full.names = T)
# subset and stack cloud free bands
cloud_free_data <- stack(LS5_2008_09_19)
# use cover function to assign NA pixels to corresponding pixels in other scene
cover <- cover(masked_data, cloud_free_data)
TRACEBACK() output:
9: toupper(format)
8: .defaultExtension(format)
7: .getExtension(filename, filetype)
6: .local(x, filename, ...)
5: writeStart(outRaster, filename = filename, format = format, datatype = datatype,
overwrite = overwrite)
4: writeStart(outRaster, filename = filename, format = format, datatype = datatype,
overwrite = overwrite)
3: .local(x, y, ...)
2: cover(masked_data, cloud_free_data)
1: cover(masked_data, cloud_free_data)
UPDATE: I tried to resample the data - still doesn't work
cloud_free_resam <- resample(cloud_free_data, masked_data)
cover <- cover(masked_data, cloud_free_resam)
ERROR:
Error in as.character(x) :
cannot coerce type 'closure' to vector of type 'character'
I also tried to crop both layers - same error
# find intersection boundary
crop_extent <- intersect(extent(cloud_free_data), extent(masked_data))
cloud_free_data <- crop(cloud_free_data, crop_extent)
masked_data <- crop(masked_data, crop_extent)
# use cover function to assign NA pixels to corresponding pixels in other scene
cover <- cover(masked_data, cloud_free_data)
GET THE DATA: (WARNING: 317mb download - unpacks to ~1gb)
https://ndownloader.figshare.com/files/8561230
Any ideas what might be causing this error with this particular dataset?
I'm sure we're missing something quite basic but...what?
Thank you in advance.
Leah
This is a bug. Cover with two multi-layered Raster* objects cannot write to disk. This can be seen in the simple example by setting
rasterOptions(todisk=TRUE)
I have fixed this in version 2-6.1 (forthcoming)
I think it is to do with when the object is converted to a rasterBrick and the raster written to a temporary file. i.e., masked_data <- mask(cloudy_scene, mask = cloud_mask)
Using Spacedman's crop creates a 'in memory' rasterBrick objects that do not rely on accessing the files; in that case the example works with no error.
But using the full extent (or cropping at end of process), raster writes and accesses (temporary) files and the error occurs.
Maybe as a temporary fix is to explicitly split up the images into memory sized chunks, mask/cover and then restitch with mosaic.
The extents of the two objects are also slightly different, so that should be fixed. Also generally a good idea to avoid issues by explicitly setting band names, min-max values and values < 0 to NA.

Igraph Write Communities

We are using igraph and R to detect communities in a network. The detection using cluster_walktrap is working great:
e <- cluster_walktrap(g)
com <-membership(e)
print(com)
write.csv2(com, file ="community.csv", sep=",")
The result is printed fine using print with the number and the community number that it belongs to but we have a problem in writing the result in the csv file and I have an error : cannot coerce class ""membership"" to a data.frame
How can I write the result of membership in a file ?
Thanks
Convert the membership object to numeric. write.csv and write.csv2 expect a data frame or matrix. The command tries to coerce the object into a data frame, which the class membership resists. Since membership really is just a vector, you can convert it a numeric. Either:
write.csv2(as.numeric(com), file ="community.csv")
Or:
com <- as.numeric(com)
write.csv2(com, file ="community.csv")
Oh, and you don't need the sep = "," argument for write.csv.
If you want to create table of vertex names/numbers and groups:
com <- cbind(V(g),e$membership) #V(g) gets the number of vertices
com <- cbind(V(g)$name,e$membership) #To get names if your vertices are labeled
I don't know if you guys resolved the problem but I did the following using R:
```
# applying the community method
com = spinglass.community(graph_builted,
weights = graph_builted$weights,
implementation = "orig",
update.rule = "config")
# creating a data frame to store the results
type = c(0)
labels = c(0)
groups = c(0)
res2 = data.frame(type, labels, groups)
labels = com$names # here you get the vertices names
groups = com$membership # here you get the communities indices
# here you save the information
res = data.frame(type = "spinGlass1", labels, groups)
res2 = rbind(res2, res)
# then you save the .csv file
write.csv(res2, "spinglass-communities.csv")
```
That resolves the problem for me.
Best regards.

Calculating percentile across netcdf files fails

I have five netcdf files where each file contains data for a time section. I want to calculate the 98th percentile for the whole timespan for each cell individually.
The accumulated file size for the netcdf files is around 250 MB.
My approach it this:
library(raster)
fileType="\\.nc$"
filenameList <- list.files(path=getwd(), pattern=fileType, full.names=F, recursive=FALSE)
#rasterStack for all layers
rasterStack <- stack()
#stack all data
for(i in 1:length(filenameList)){
filename <- filenameList[i]
stack.temp<-stack(filename)
rasterStack<-stack(rasterStack, stack.temp)
}
#calculate raster containing the 98th percentiles
result <- calc(rasterStack, fun = function(x) {quantile(x,probs = .98,na.rm=TRUE)} )
However, I get this error:
Error in ncdf4::nc_close(x#file#con) :
no slot of name "con" for this object of class ".RasterFile"
The stacking section of my code works, the crash happens during the calc function.
Do you have any idea where this might come from? Is it maybe an issue of where the data is stored (memory/disk)?
Strange, I generated some dummy data and it seems to work just fine, it does not seems to be your method. 250MB is not overly huge. I would clip a small piece of each raster and test if it works.
dat<-matrix(rnorm(16), 4, 4)
r1<-raster(dat)
r2<-r1*2
r3<-r2+1
r4<-r3+4
rStack <- stack(r1,r2,r3,r4)
result <- calc(rStack, fun = function(x) {quantile(x,probs = .98)} )
Perhaps this is related to the odd way you create a RasterStack. You should simply do:
filenames <- list.files(pattern="\\.nc$")
rasterStack <- stack(filenames)

Merging multiple rasters in R

I've been trying to find a time-efficient way to merge multiple raster images in R. These are adjacent ASTER scenes from the southern Kilimanjaro region, and my target is to put them together to obtain one large image.
This is what I got so far (object 'ast14dmo' representing a list of RasterLayer objects):
# Loop through single ASTER scenes
for (i in seq(ast14dmo.sd)) {
if (i == 1) {
# Merge current with subsequent scene
ast14dmo.sd.mrg <- merge(ast14dmo.sd[[i]], ast14dmo.sd[[i+1]], tolerance = 1)
} else if (i > 1 && i < length(ast14dmo.sd)) {
tmp.mrg <- merge(ast14dmo.sd[[i]], ast14dmo.sd[[i+1]], tolerance = 1)
ast14dmo.sd.mrg <- merge(ast14dmo.sd.mrg, tmp.mrg, tolerance = 1)
} else {
# Save merged image
writeRaster(ast14dmo.sd.mrg, paste(path.mrg, "/AST14DMO_sd_", z, "m_mrg", sep = ""), format = "GTiff", overwrite = TRUE)
}
}
As you surely guess, the code works. However, merging takes quite long considering that each single raster object is some 70 mb large. I also tried Reduce and do.call, but that failed since I couldn't pass the argument 'tolerance' which circumvents the different origins of the raster files.
Anybody got an idea of how to speed things up?
You can use do.call
ast14dmo.sd$tolerance <- 1
ast14dmo.sd$filename <- paste(path.mrg, "/AST14DMO_sd_", z, "m_mrg.tif", sep = "")
ast14dmo.sd$overwrite <- TRUE
mm <- do.call(merge, ast14dmo.sd)
Here with some data, from the example in raster::merge
r1 <- raster(xmx=-150, ymn=60, ncols=30, nrows=30)
r1[] <- 1:ncell(r1)
r2 <- raster(xmn=-100, xmx=-50, ymx=50, ymn=30)
res(r2) <- c(xres(r1), yres(r1))
r2[] <- 1:ncell(r2)
x <- list(r1, r2)
names(x) <- c("x", "y")
x$filename <- 'test.tif'
x$overwrite <- TRUE
m <- do.call(merge, x)
The 'merge' function from the Raster package is a little slow. For large projects a faster option is to work with gdal commands in R.
library(gdalUtils)
library(rgdal)
Build list of all raster files you want to join (in your current working directory).
all_my_rasts <- c('r1.tif', 'r2.tif', 'r3.tif')
Make a template raster file to build onto. Think of this a big blank canvas to add tiles to.
e <- extent(-131, -124, 49, 53)
template <- raster(e)
projection(template) <- '+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs'
writeRaster(template, file="MyBigNastyRasty.tif", format="GTiff")
Merge all raster tiles into one big raster.
mosaic_rasters(gdalfile=all_my_rasts,dst_dataset="MyBigNastyRasty.tif",of="GTiff")
gdalinfo("MyBigNastyRasty.tif")
This should work pretty well for speed (faster than merge in the raster package), but if you have thousands of tiles you might even want to look into building a vrt first.
You can use Reduce like this for example :
Reduce(function(...)merge(...,tolerance=1),ast14dmo.sd)
SAGA GIS mosaicking tool (http://www.saga-gis.org/saga_tool_doc/7.3.0/grid_tools_3.html) gives you maximum flexibility for merging numeric layers, and it runs in parallel by default! You only have to translate all rasters/images to SAGA .sgrd format first, then run the command line saga_cmd.
I have tested the solution using gdalUtils as proposed by Matthew Bayly. It works quite well and fast (I have about 1000 images to merge). However, after checking with document of mosaic_raster function here, I found that it works without making a template raster before mosaic the images. I pasted the example codes from the document below:
outdir <- tempdir()
gdal_setInstallation()
valid_install <- !is.null(getOption("gdalUtils_gdalPath"))
if(require(raster) && require(rgdal) && valid_install)
{
layer1 <- system.file("external/tahoe_lidar_bareearth.tif", package="gdalUtils")
layer2 <- system.file("external/tahoe_lidar_highesthit.tif", package="gdalUtils")
mosaic_rasters(gdalfile=c(layer1,layer2),dst_dataset=file.path(outdir,"test_mosaic.envi"),
separate=TRUE,of="ENVI",verbose=TRUE)
gdalinfo("test_mosaic.envi")
}
I was faced with this same problem and I used
#Read desired files into R
data_name1<-'file_name1.tif'
r1=raster(data_name1)
data_name2<-'file_name2.tif'
r2=raster(data_name2)
#Merge files
new_data <- raster::merge(r1, r2)
Although it did not produce a new merged raster file, it stored in the data environment and produced a merged map when plotted.
I ran into the following problem when trying to mosaic several rasters on top of each other
In vv[is.na(vv)] <- getValues(x[[i]])[is.na(vv)] :
number of items to replace is not a multiple of replacement length
As #Robert Hijmans pointed out, it was likely because of misaligned rasters. To work around this, I had to resample the rasters first
library(raster)
x <- raster("Base_raster.tif")
r1 <- raster("Top1_raster.tif")
r2 <- raster("Top2_raster.tif")
# Resample
x1 <- resample(r1, crop(x, r1))
x2 <- resample(r2, crop(x, r2))
# Merge rasters. Make sure to use the right order
m <- merge(merge(x1, x2), x)
# Write output
writeRaster(m,
filename = file.path("Mosaic_raster.tif"),
format = "GTiff",
overwrite = TRUE)

Resources