Merge rasters of different extents, sum overlapping cell values in R

I am trying to merge rasterized polylines which have differing extents, in order to create a single surface indicating the number of times cells overlap.
Due to computational constraints (given the size of my study area), I am unable to use extend and then stack for each raster (total count = 67).
I have come across the merge function in R, which lets me merge rasters into a single surface. It doesn't, however, seem to accept a function for computing the sum of overlapping cells.
Maybe I'm missing something obvious, or this is a limitation of the merge function. Any advice on how to generate this output while avoiding extend & stack would be greatly appreciated!
Code:
# read in specific route rasters
library(raster)

raster_list <- list.files('Data/Raw/tracks/rasterized/', full.names = TRUE)

for(i in 1:length(raster_list)){
  # get file name
  file_name <- raster_list[i]
  # read raster in
  road_rast_i <- raster(file_name)
  if(i == 1){
    combined_raster <- road_rast_i
  } else {
    # merge rasters and calc overlap
    combined_raster <- merge(combined_raster, road_rast_i,
                             fun = function(x, y){sum(x@data@values, y@data@values)})
  }
}
(Images omitted: the current merged output, a single example route, and the fixed output.)

Solved. There's a mosaic function, which allows the following:
combined_raster <- mosaic(combined_raster, road_rast_i, fun = sum)
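For reference, the full loop with the fix applied would look something like this (a sketch assuming the same file layout as above; mosaic() needs the rasters to share origin and resolution, but their extents can differ):
library(raster)

raster_list <- list.files('Data/Raw/tracks/rasterized/', full.names = TRUE)

for(i in 1:length(raster_list)){
  road_rast_i <- raster(raster_list[i])
  if(i == 1){
    combined_raster <- road_rast_i
  } else {
    # mosaic() extends to the union of the two extents and sums overlapping cells
    combined_raster <- mosaic(combined_raster, road_rast_i, fun = sum)
  }
}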

Related

Kohonen map - How to find position of one dataset?

I have a dataframe df with my data of interest
I rescale with
df.sc <- scale(df)
and make my Kohonen map with
df.grid <- somgrid(15, 10, "hexagonal")
df.som <- som(df.sc, rlen=700, grid = df.grid)
That works fine and I get a nice map.
Now I have an extra datapoint
extra.sc <- as.matrix(-0.29985191, -0.35905786, -0.260923297, -0.2415673150,
-0.259426676, -0.330404078)
It is scaled exactly the same way as df.sc
Now I want to find the position of the winning unit in the Kohonen map (df.som) for extra.sc.
map(df.som,extra.sc)
does not give me what I want.
How can I determine the position of extra.sc within df.som? And preferably, how can I also mark it on the map?
Maybe you defined your new data incorrectly, i.e. it does not have the same dimensions as the training data. Check the output of extra.sc by wrapping it in parentheses: (extra.sc). I recommend providing the number of rows and columns in the definition of extra.sc, using the matrix() and c() functions instead of as.matrix(). For example:
extra.sc <- matrix(c(-0.29985191, -0.35905786, -0.260923297, -0.2415673150, -0.259426676, -0.330404078), nrow = 1, ncol = 6)
and observe the result:
(extra.sc)
It is one row and six columns. If you do not provide the shape of your data, R will regard it as one column and six rows:
extra.sc <- matrix(c(-0.29985191, -0.35905786, -0.260923297, -0.2415673150, -0.259426676, -0.330404078))
(extra.sc)
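As for marking the point on the map, a minimal sketch, assuming the kohonen package and the corrected 1 x 6 extra.sc above (recent kohonen versions may also require the column names to match the training data):
# find the winning unit for extra.sc and highlight it on the trained map
winner <- kohonen::map(df.som, extra.sc)$unit.classif  # index of best-matching unit
plot(df.som, type = "mapping")                         # plot the trained map
points(df.som$grid$pts[winner, 1], df.som$grid$pts[winner, 2],
       pch = 21, bg = "red", cex = 2)                  # mark the winning unit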

Speeding up a loop over rasters

I have a big dataset with 30000 rasters. My goal is to extract the mean value of each raster within a polygon, and to create a file with the extracted values and the dates taken from the raster filenames.
I succeeded in doing this by performing the following loop:
ext <- numeric(length(rasters2014))
for (i in 1:length(rasters2014)){
  a <- raster(rasters2014[i])
  ext[i] <- as.vector(extract(a, poligon2, fun=mean, na.rm=TRUE, df=F))
}
output2 = data.frame(ext, filename=filename2014)
The problem is that the loop presented above takes about 2.5 hours to complete. Does anyone have an idea how I could speed up this process?
If your rasters are all properly aligned (same ncol, nrow, extent, origin, resolution), you could try identifying the "cell numbers" to be extracted by looking at the first file, then extracting based on those. This could speed up the processing because raster does not need to compute which cells to extract. Something like this:
rast1 <- raster(rasters2014[1])
cells <- extract(rast1, poligon2, cellnumbers = TRUE, df = TRUE)[,"cells"]
ext <- list()
for (i in 1:length(rasters2014)){
  a <- raster(rasters2014[i])
  # extract by cell number and average; fun/df are not needed when y is a vector of cell numbers
  ext[[i]] <- mean(extract(a, cells), na.rm = TRUE)
}
Note that I am also using a list to store the results to avoid "growing" a vector, which is usually wasteful.
Alternatively, as suggested by @qdread, you could build a RasterStack using raster::stack(rasters2014, quick = TRUE) and call extract over the stack to avoid the for loop (see the sketch below). Don't know which would be faster.
HTH
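For completeness, the stack-based alternative might look like this (a sketch assuming the rasters are aligned and poligon2 is the same single polygon as above):
library(raster)

s <- raster::stack(rasters2014, quick = TRUE)          # quick = TRUE skips per-file alignment checks
ext <- extract(s, poligon2, fun = mean, na.rm = TRUE)  # one mean per layer, i.e. per raster
output2 <- data.frame(ext = as.numeric(ext), filename = filename2014)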
If your polygons do not overlap (and in most cases they don't), an alternative route is:
library(raster)
x <- rasterize(poligon2, raster(rasters2014[1]))
s <- raster::stack(rasters2014, quick = TRUE)
z <- zonal(s, x, "mean")
PS: Faster is nicer, but I would suggest getting lunch while this runs.
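To tie the zonal() output back to the original output2 data frame, something like the following should work (a sketch assuming a single polygon, i.e. a single zone):
# zonal() returns one row per zone: the zone id followed by the mean of each layer
output2 <- data.frame(ext = as.numeric(z[1, -1]), filename = filename2014)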
Thanks for your help! I've tried all of the proposed solutions and the computation time is roughly the same regardless of the applied method. Therefore, I guess it is just not possible to significantly speed up this computation.

How to move to the next iteration in a loop if the previous one got an empty dataframe?

I want to extract from the online database OBIS all the species occurrence records for a group of polygons. The number of polygons is too large to get all of them at the same time, so I thought I'd use a loop. The problem I'm facing is that not all polygons have records, so the result is an empty dataframe and the loop stops. I tried to use the control-flow "if" but it is not working. Can I get what I need with a loop? Here is a shorter version of the shapefile I'm using.
library(robis)
library(maptools)
library(mregions)
library(plyr)
polygons <- readShapeSpatial("~/smaller.shp")
occurrence_list = list()
for (i in 1:length(polygons)){
  wkt_polygons <- mr_as_wkt(polygons[i,])
  occur <- occurrence(geometry=wkt_polygons)
  if(is.null(occur)) next
  occur$i <- i
  occurrence_list[[i]] <- occur
}
data <- dplyr::bind_rows(occurrence_list)
I'm not sure what the required result should look like, but this might work:
occurrence_list = list()
for (i in 1:length(polygons)){
  wkt_polygons <- mr_as_wkt(polygons[i,])
  occur <- occurrence(geometry=wkt_polygons)
  if(nrow(occur) > 0) {
    occur$i <- i
    occurrence_list[[length(occurrence_list) + 1]] <- occur
  }
}
data <- dplyr::bind_rows(occurrence_list)
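If occurrence() can also return NULL (which the original if was guarding against), both checks can be combined; a small sketch based on the same loop:
occurrence_list <- list()
for (i in 1:length(polygons)){
  wkt_polygons <- mr_as_wkt(polygons[i,])
  occur <- occurrence(geometry = wkt_polygons)
  if(is.null(occur) || nrow(occur) == 0) next  # skip polygons with no records
  occur$i <- i
  occurrence_list[[length(occurrence_list) + 1]] <- occur
}
data <- dplyr::bind_rows(occurrence_list)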

R function to count coordinates

I have a spatial dataframe in R and would like to subset all of the more complicated shapes, i.e. shapes with 10 or more coordinates. I'm trying to get this done via mapply or something similar, without explicit iteration. The shapefile is substantial (10k shapes), and a method that is fine for a small sample is very slow for a big one. The iterative method is:
Street$cc <- 0
i <- 1
while(i <= nrow(Street)){
  Street$cc[i] <- length(coordinates(Street)[[i]][[1]])/2
  i <- i + 1
}
How can I get the same effect in a vectorized way? I have a problem with accessing a few levels down from the top (Shapefile/lines/Lines/coords).
I tried:
Street$cc <- lapply(slot(Street, "lines"),
                    function(x) lapply(slot(x, "Lines"),
                                       function(y) length(slot(y, "coords"))/2))
(division by 2 because each coordinate is a pair of two values)
but it still returns a list with the number of items per row, not an integer telling me how many items there are. How can I get the number of coordinates for each shape in a spatial dataframe? Sorry, I don't have a reproducible example, but you can check on any spatial file - it is more about accessing a low-level property than a very specific issue.
EDIT: I resolved the issue using the function tail().
Here is a reproducible example. Slightly different from yours, because you did not provide data, but the principle is the same. The 'principle' when drilling down into complex S4 structures is to pay attention to whether each level is a list or a slot, using [[]] to access lists and @ for slots.
First let's get a spatial polygon. I'll use the US state boundaries:
library(maps)
library(maptools)  # for map2SpatialPolygons
local.map = map(database = "state", fill = TRUE, plot = FALSE)
IDs = sapply(strsplit(local.map$names, ":"), function(x) x[1])
states = map2SpatialPolygons(map = local.map, ID = IDs)
Now we can subset the polygons with fewer than 200 vertices like this:
# Note: next line assumes that only interested in one Polygon per top level polygon.
# I.e. assumes that we have only single part polygons
# If you need to extend this to work with multipart polygons, it will be
# necessary to also loop over values of lower level Polygons
lengths = sapply(1:length(states), function(i)
  NROW(states@polygons[[i]]@Polygons[[1]]@coords))
simple.states = states[which(lengths < 200)]
plot(simple.states)
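Applying the same drill-down to the original Street object, a sketch with sapply() (assuming Street is a SpatialLinesDataFrame, as the cc loop suggests) returns a plain integer vector instead of a nested list:
# count coordinate pairs per feature, summing over all Lines in each feature
Street$cc <- sapply(slot(Street, "lines"), function(x)
  sum(sapply(slot(x, "Lines"), function(y) nrow(slot(y, "coords")))))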

performing a calculation with a `paste`d vector reference

So I have some lidar data that I want to calculate some metrics for (I'll attach a link to the data in a comment).
I also have ground plots that I have extracted the lidar points around, so that I have a couple hundred points per plot (19 plots). Each point has X, Y, Z, height above ground, and the associated plot.
I need to calculate a bunch of metrics on the plot level, so I created plotsgrouped with split(plotpts, plotpts$AssocPlot).
So now I have a data frame with a "page" for each plot, so I can calculate all my metrics by the "plot page". This works just dandy for individual plots, but I want to automate it. (yes, I know there's only 19 plots, but it's the principle of it, darn it! :-P)
So far, I've got a for loop going that calculates the metrics and puts the results in a data frame called Results. I pulled the names of the groups into a list called groups as well.
for(i in 1:length(groups)){
  Results$Plot[i] <- groups[i]
  Results$Mean[i] <- mean(plotsgrouped$PLT01$Z)
  Results$Std.Dev.[i] <- sd(plotsgrouped$PLT01$Z)
  Results$Max[i] <- max(plotsgrouped$PLT01$Z)
  Results$`75%Avg.`[i] <- mean(plotsgrouped$PLT01$Z[plotsgrouped$PLT01$Z <= quantile(plotsgrouped$PLT01$Z, .75)])
  Results$`50%Avg.`[i] <- mean(plotsgrouped$PLT01$Z[plotsgrouped$PLT01$Z <= quantile(plotsgrouped$PLT01$Z, .50)])
...
and so on.
The problem arises when I try to do something like:
Results$mean[i] <- mean(paste("plotsgrouped", groups[i],"Z", sep="$")). mean() doesn't recognize the paste as a reference to the vector plotsgrouped$PLT27$Z, and instead fails. I've deduced that it's because it sees the quotes and thinks, "Oh, you're just some text, I can't get the mean of you." or something to that effect.
Btw, groups is a list of the 19 plot names: PLT01-PLT27 (non-consecutive sometimes) and FTWR, so I can't simply put a sequence for the numeric part of the name.
Anyone have an easier way to iterate across my test plots and get arbitrary metrics?
I feel like I have all the right pieces, but just don't know how they go together to give me what I want.
Also, if anyone can come up with a better title for the question, feel free to post it or change it or whatever.
Try with:
for(i in seq_along(groups)) {
  Results$Plot[i] <- groups[i]  # character names of the groups
  tempZ <- plotsgrouped[[groups[i]]][["Z"]]
  Results$Mean[i] <- mean(tempZ)
  Results$Std.Dev.[i] <- sd(tempZ)
  Results$Max[i] <- max(tempZ)
  Results$`75%Avg.`[i] <- mean(tempZ[tempZ <= quantile(tempZ, .75)])
  Results$`50%Avg.`[i] <- mean(tempZ[tempZ <= quantile(tempZ, .50)])
}
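The same idea also works without the explicit loop; a sketch assuming plotsgrouped and groups as defined in the question (the quantile columns would follow the same pattern):
# index the split list by name instead of paste()-ing a variable name
Results <- data.frame(
  Plot     = unlist(groups),
  Mean     = sapply(groups, function(g) mean(plotsgrouped[[g]]$Z)),
  Std.Dev. = sapply(groups, function(g) sd(plotsgrouped[[g]]$Z)),
  Max      = sapply(groups, function(g) max(plotsgrouped[[g]]$Z)),
  check.names = FALSE
)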
