How to properly crop() raster data extent in R

I'm trying to crop some raster data and do some calculations (getting the mean sea surface temperature, specifically).
However, cropping the extent of the raster data before doing the calculations gives me the same result as doing the calculations first and then cropping the resulting data.
The original extent of the raster data is -180, 180, -90, 90 (xmin, xmax, ymin, ymax), and I need to crop it to any desired region defined by latitude and longitude coordinates.
This is the script I'm doing tests with:
library(raster) # Crop raster data
library(stringr)
# hadsstR functions ----------------------------------------
load_hadsst <- function(file = "./HadISST_sst.nc") {
  b <- brick(file)
  NAvalue(b) <- -32768 # Land
  return(b)
}
# Transform basin coordinates into numbers
morph_coords <- function(coords){
  coords[1] <- ifelse(str_extract(coords[1], "[A-Z]") == "W",
                      -as.numeric(str_extract(coords[1], "[^A-Z]+")),
                      as.numeric(str_extract(coords[1], "[^A-Z]+")))
  coords[2] <- ifelse(str_extract(coords[2], "[A-Z]") == "W",
                      -as.numeric(str_extract(coords[2], "[^A-Z]+")),
                      as.numeric(str_extract(coords[2], "[^A-Z]+")))
  coords[3] <- ifelse(str_extract(coords[3], "[A-Z]") == "S",
                      -as.numeric(str_extract(coords[3], "[^A-Z]+")),
                      as.numeric(str_extract(coords[3], "[^A-Z]+")))
  coords[4] <- ifelse(str_extract(coords[4], "[A-Z]") == "S",
                      -as.numeric(str_extract(coords[4], "[^A-Z]+")), # fixed typo: was coords[2]
                      as.numeric(str_extract(coords[4], "[^A-Z]+")))
  return(coords)
}
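# For example, morph_coords() turns compass-style coordinates into signed
# values - still character strings, hence the as.numeric() in crop() below:
# morph_coords(c("85E", "90E", "5N", "10N")) # "85" "90" "5" "10"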
# Comparison test ------------------------------------------
hadsst.raster <- load_hadsst(file = "~/Hadley/HadISST_sst.nc")
x <- hadsst.raster
nms <- names(x)
months <- c("01","02","03","04","05","06","07","08","09","10","11","12")
coords <- c("85E", "90E", "5N", "10N")
coords <- morph_coords(coords)
years = 1970:1974
range = 5:12
# Crop before calculating mean
x <- crop(x, extent(as.numeric(coords[1]), as.numeric(coords[2]),
                    as.numeric(coords[3]), as.numeric(coords[4])))
xMeans <- vector(length = length(years), mode = 'list')
for (ix in seq_along(years)) {
  xMeans[[ix]] <- mean(x[[c(sapply(range, function(x) grep(paste0(years[ix], '.', months[x]), nms)))]], na.rm = TRUE)
}
mean.brick1 <- do.call(brick,xMeans)
# Calculate mean before cropping
x <- hadsst.raster
xMeans <- vector(length = length(years), mode = 'list')
for (ix in seq_along(years)) {
  xMeans[[ix]] <- mean(x[[c(sapply(range, function(x) grep(paste0(years[ix], '.', months[x]), nms)))]], na.rm = TRUE)
}
mean.brick2 <- do.call(brick, xMeans)
mean.brick2 <- crop(mean.brick2, extent(as.numeric(coords[1]), as.numeric(coords[2]),
                                        as.numeric(coords[3]), as.numeric(coords[4])))
# Compare the two rasters
mean.brick1 - mean.brick2
This is the output of mean.brick1 - mean.brick2:
class : RasterBrick
dimensions : 5, 5, 25, 5 (nrow, ncol, ncell, nlayers)
resolution : 1, 1 (x, y)
extent : 85, 90, 5, 10 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84
data source : in memory
names : layer.1, layer.2, layer.3, layer.4, layer.5
min values : 0, 0, 0, 0, 0
max values : 0, 0, 0, 0, 0
As you can see, both RasterBricks are exactly the same, which should be impossible for any arbitrary choice of coordinates, as I tried to exemplify with a small matrix (illustration image omitted).
Is there something I'm doing wrong? Cropping the data before doing calculations with them should unequivocally give me different results.

Ok, I'll continue from my post in your previous question:
We start out with the full hadsst.raster brick (which, for a reproducible example, can be mocked up with the first part of the solution in my previous answer).
So this dataset has the dimensions 180, 360, 516, meaning 180 rows, 360 columns and 516 temporal layers.
Technically, a raster being a matrix, this is how it might look:
Just a bunch of matrix layers (516 to be precise), where each pixel is exactly aligned. Here I only have three example layers, the rest is indicated by the three dots.
So if we do temporal averaging, we basically extract all the values for a single pixel and take the mean (or any other averaging operation) of them. This is indicated here by the red squares.
This also shows why cropping does not influence the temporal averaging:
If we say the orange square is our extent of interest and we perform the cropping operation before the averaging, we basically discard all values around this square. After that we again take all the values for each pixel over all layers and perform our average.
This should now make it clear why it doesn't matter when you discard the pixels around the orange square. You could also calculate the average for them and discard the values afterwards, leaving you with just the values of your orange square. It just doesn't make any real sense if you're already sure you won't need them for further calculations.
Regardless, the values inside the square won't be affected.
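To convince yourself, here is a minimal sketch with a made-up 10 x 10 x 6 brick (my own toy example, not the HadISST data):
library(raster)
b <- brick(array(rnorm(10 * 10 * 6), dim = c(10, 10, 6))) # toy brick, default extent 0..1
e <- extent(0.2, 0.6, 0.2, 0.6)
m1 <- mean(crop(b, e)) # crop first, then average over layers
m2 <- crop(mean(b), e) # average over layers first, then crop
all.equal(values(m1), values(m2)) # TRUE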
When we talk about spatial averaging, it generally means averaging over pixels within a single layer, in this case probably over the values inside the orange rectangle.
Two common operations for that are
focal averaging (also known as neighbourhood averaging)
aggregation
Focal averaging takes, for each pixel, the average of the values of a defined number of adjacent pixels (most commonly a 3x3 square, where the pixel to be redefined is the central one).
Aggregation literally takes a number of pixels and combines them into a bigger pixel. This means not only that the values of these pixels are averaged, but also that the resulting raster has fewer individual pixels and a coarser resolution.
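A small sketch of both operations with the raster package (on a toy 6 x 6 raster of my own):
library(raster)
r <- raster(matrix(1:36, nrow = 6))
r_foc <- focal(r, w = matrix(1/9, 3, 3))    # 3x3 neighbourhood mean
r_agg <- aggregate(r, fact = 2, fun = mean) # 2x2 blocks combined into one coarser pixel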
Alright, coming to the actual solution for you:
I assume you have an area of interest defined by an extent aoi:
aoi <- extent(xmin,xmax,ymin,ymax)
The first thing you would do is crop the initial brick to reduce the computational burden:
hadsst.raster_crp <- crop(hadsst.raster,aoi)
The next step is the temporal averaging, where we use the function I've defined in the solution from my other post:
hadsst.raster_crp_avg <- hadSSTmean(hadsst.raster_crp, 1969:2011, first.range = 11:12, second.range = 1:4)
Alright, now you have your temporal averages just for your region of interest. The next step depends on what your ultimate goal is.
As far as I understood, you just need a single mean value per temporal average over your region of interest.
If that is the case, it might be the right time to leave the actual raster domain and continue with base R:
res <- lapply(1:nlayers(hadsst.raster_crp_avg),
              function(ix) mean(as.matrix(hadsst.raster_crp_avg[[ix]]), na.rm = TRUE)) # na.rm = TRUE so land cells (NA) don't make the mean NA
This will give you a list with as many elements as your brick hadsst.raster_crp_avg has layers.
Using lapply, we iterate through the layers, converting each layer into a matrix and then calculating the mean over all elements leaving us with a single value per averaged-timestep for the entire area of interest.
Going further, you can use unlist to convert it to a vector and then add it to a data.frame, or perform any other operation you like.
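For instance (a sketch; cellStats(hadsst.raster_crp_avg, mean) should give the same per-layer means in a single call):
sst_means <- unlist(res)
res_df <- data.frame(step = seq_along(sst_means), mean_sst = sst_means)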
Hopefully that was clear and this is what you were looking for.
Best

Related

R: calculate Euclidean distance between two raster layers pixels

I have two raster layers for the same area. I need to find the Euclidean distance between each cell of the coarse resolution raster and the cells of the fine resolution raster that fall within it. For example:
The red square is the pixel of coarse resolution raster while the blue squares are the pixels of fine resolution raster. The black dot is the centroid of coarse resolution raster and the blue dots are the centroids of fine resolution raster.
There are similar questions posted, but the difference with my question is that I don't want to compute the nearest distances between raster cells.
My coarse resolution raster has a pixel size of 460m and my fine resolution raster of 100m. What I have done so far is to create point symbols from the centroids of the raster cells for both rasters. How can I compute the Euclidean distance between each coarse pixel and its corresponding fine pixels?
library(terra)
fr = rast("path/fine_image.tif") # fine resolution raster
cr = rast("path/coarse_image.tif") # coarse resolution raster
fr_p = as.points(fr, values = TRUE, na.rm = TRUE, na.all = FALSE) # fine resolution points
cr_p = as.points(cr, values = TRUE, na.rm = TRUE, na.all = FALSE) # coarse resolution points
I am not sure how to proceed from here. Any recommendations?
Here are my rasters:
fr = rast(ncols=108, nrows=203, nlyrs=1, xmin=583400, xmax=594200, ymin=1005700, ymax=1026000, names=c('B10_median'), crs='EPSG:7767')
cr = rast(ncols=23, nrows=43, nlyrs=1, xmin=583280, xmax=593860, ymin=1006020, ymax=1025800, names=c('coarse_image'), crs='EPSG:7767')
The solution came from @michael's answer, and the output raster (after cropping and masking with a polygon shapefile) looks like this:
where the yellow squares are the cells of the coarse raster and the raster underneath is the output of the code in the answer.
This is a bit hacky but I think it might do what you want...
# Raster at fine resolution where values are cell indices
fr_cells <- fr
values(fr_cells) <- 1:ncell(fr)
# Second raster at fine resolution where values are indices of
# the surrounding coarse res cell (if there is one)
fr_cr <- fr
fr_xy <- xyFromCell(fr, 1:ncell(fr))
values(fr_cr) <- extract(cr, fr_xy, cells = TRUE)[, "cell"]
# Function to calculate distance given a pair of cell indices
fn <- function(x) {
  fr_xy <- xyFromCell(fr, x[1])
  cr_xy <- xyFromCell(cr, x[2])
  sqrt(sum((fr_xy - cr_xy)^2))
}
fr_dist <- app(c(fr_cells, fr_cr), fun = fn)
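A quick visual sanity check of the result (my addition, assuming the objects above):
plot(fr_dist, main = "Distance to the enclosing coarse cell centre")
lines(cr, col = "red")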
You can use terra::distance for that
Example data
library(terra)
fr <- rast(ncols=108, nrows=203, nlyrs=1, xmin=583400, xmax=594200, ymin=1005700, ymax=1026000, names='B10_median', crs='EPSG:7767')
cr <- rast(ncols=23, nrows=43, nlyrs=1, xmin=583280, xmax=593860, ymin=1006020, ymax=1025800, names='coarse_image', crs='EPSG:7767')
Solution
pts <- as.points(cr, values=FALSE, na.rm=F)
crs(pts) <- crs(cr)
d <- distance(fr, pts)
Illustration
plot(d)
zoom(d, col=gray((1:255)/255))
lines(cr, col="red", lwd=2)
Note that this approach also computes the distance to the center of the nearest cell in cr for cells that are not covered by cr. You could remove those values with
dm <- mask(d, as.polygons(ext(cr)))
dm
#class : SpatRaster
#dimensions : 203, 108, 1 (nrow, ncol, nlyr)
#resolution : 100, 100 (x, y)
#extent : 583400, 594200, 1005700, 1026000 (xmin, xmax, ymin, ymax)
#coord. ref. : WGS 84 / Maharashtra (EPSG:7767)
#source(s) : memory
#name : B10_median
#min value : 0.000
#max value : 311.127

Looping dataframe values into a raster through time

I have a raster N showing the overall distribution of a species.
The raster cells have a value of 1 where the species is present, and a value of 0 otherwise.
I also have a data frame DF showing the relative biomass of this same species over time:
Biomass<-c(0.9, 1.2, 1.3)
Year<-c(1975, 1976, 1977)
DF <- data.frame(Biomass, Year) # data.frame (not c()) so DF$Biomass works below
I would like to create (and save) a new raster for each year of my time series through a loop, where all raster cells originally equal to 1 (N[N == 1]) are replaced by the biomass value found in DF for that specific year.
For example, all the cells originally equaling 1 would be replaced by 0.9 and the raster would be saved as N-1975.
The idea would be to create a loop, but I cannot find anything on looping values of a dataframe into a raster.
I would like to end up with one raster per year "N-1975", "N-1976"...
Thank you !
What spatial information are you working with? If you have a simple xy coordinate system, you can use rasterFromXYZ(df) to get a raster layer for each column of your data frame, as long as your first two columns are x and y coordinates respectively. If you're using some other projection, you can specify it in the function (see https://rdrr.io/cran/raster/man/rasterFromXYZ.html):
# make some random data
x <- c(1, 4, 3, 2, 4)
y <- c(4, 3, 1, 1, 4)
# best to avoid pure numbers as column names
X1975 <- rnorm(5, 4, 1)
X1976 <- rnorm(5, 5, 1)
# make df
df <- cbind(x, y, X1975, X1976)
# make raster
biomass_raster <- rasterFromXYZ(df)
biomass_raster
biomass_raster
#returns
class : RasterBrick
dimensions : 4, 4, 16, 2 (nrow, ncol, ncell, nlayers)
resolution : 1, 1 (x, y)
extent : 0.5, 4.5, 0.5, 4.5 (xmin, xmax, ymin, ymax)
crs : NA
source : memory
names : X1975, X1976
min values : 1.290337, 4.523350
max values : 4.413451, 6.512719
#plot all layers: plot(biomass_raster)
#access specific layer by calling biomass_raster$X1975
I ended up finding how to solve this issue, so I will post it here in case anybody runs into the same problem :)
N_loop <- N
years <- 1975:2020
for (i in seq_along(years)) {
  N_loop[N == 1] <- DF$Biomass[i]
  writeRaster(N_loop, paste0("N", years[i], ".asc"), overwrite = TRUE)
}

How to efficiently count the number of spatial points within a certain distance around raster cells in R?

I would like to count the number of spatial points (of a SpatialPointsDataFrame object) within a certain distance to every cell of a RasterLayer in R. The resulting value should replace the original value of that particular raster cell.
Here is a reproducible example:
# load library
library(raster)
# generate raster
ras <- raster(nrow=18, ncol=36)
values(ras) <- NA
# create SpatialPointsDataFrame
x <- c(-160,-155,-153,-150, 30, -45, -44, -42, -40, 100, 110, 130)
y <- c(-75,-73,-71,-60, 0, 30, 35, 40, 41, 10, -10, 60)
z <- c(seq(1, 12, 1))
df <- data.frame(x,y,z)
spdf <- SpatialPointsDataFrame(coords = df[, c(1, 2)],
                               data = as.data.frame(df[, 3]),
                               proj4string = CRS("+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0"))
# visualize
plot(ras)
plot(spdf, add=T)
# loop over all raster cells
for (r in 1:nrow(ras)) {
  for (c in 1:ncol(ras)) {
    # duplicate raster for subsequent modification
    ras_x <- ras
    # define cell for which to count the number of surrounding points
    ras_x[r, c] <- nrow(spdf) # a value that cannot occur otherwise; temporary placeholder
    ras_x[ras_x != nrow(spdf)] <- NA
    # convert raster cell to spatial point
    spatial_point <- rasterToPoints(ras_x, spatial = TRUE)
    # calculate distance around raster cell
    ras_dist <- distanceFromPoints(ras_x, spatial_point)
    ras_dist <- ras_dist / 1000000 # scale values
    # define circular zone by setting distance threshold (raster only with values 1 or NA)
    ras_dist[ras_dist > 2] <- NA
    ras_dist[ras_dist <= 2] <- 1
    # create empty vector to count spatial points located within the zone around the raster cell
    empty_vec <- c()
    # loop to check which value every point of the SpatialPointsDataFrame corresponds to
    for (i in 1:nrow(spdf)) {
      point <- extract(ras_dist, spdf[i, ])
      empty_vec[i] <- point
    }
    # sum of resulting vector is the number of points within the zone around the raster cell
    val <- sum(na.omit(empty_vec))
    ras[r, c] <- val
    # print for progress monitoring
    print(paste0("sum of points within radius around cell row ", r, " and column ", c, " is ", val))
    print(paste0("finished ", r, " out of ", nrow(ras)))
    print(paste0("finished ", c, " out of ", ncol(ras)))
    # both plots are just for visualization and progress monitoring
    plot(ras)
    plot(spdf, add = TRUE)
  }
}
plot(ras)
plot(spdf, add=T)
The resulting raster is exactly what I want but my way of checking the underlying raster values for each point of the SpatialPointsDataFrame seems inefficient. My real data consists of a RasterLayer with 2160, 4320, 9331200 (nrow, ncol, ncell) and a SpatialPointsDataFrame with 2664 features.
Is there a more efficient way to generate this raster, i.e. to count how many points are located within a certain distance of every raster cell?
If you can work with projected coordinates this can be done fairly easily with the spatstat package.
This requires you to project your points (and grid) with e.g. sf::st_transform() and will not work
on a global scale.
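For example, a minimal sketch for the question's spdf (EPSG:3857 is only a placeholder here; pick a projected CRS that suits your study area):
library(sf)
spdf_proj <- st_transform(st_as_sf(spdf), 3857)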
Load spatstat and make 2000 random points to test against:
library(spatstat)
W <- square(1)
set.seed(42)
Y <- runifpoint(2000) # Random points in the unit square
plot(Y, main = "Random points in unit square")
Make 3000x3000 grid of points (9 million points):
xy <- gridcenters(W, 3000, 3000) # Grid of points in the unit square
X <- ppp(xy$x, xy$y, window = W, check = FALSE, checkdup = FALSE)
For each of the 9 million grid points count the number of other points within
radius 0.01 (timed on my reasonably fast laptop with 16GB RAM):
system.time(counts <- crosspaircounts(X, Y, r = .01))
#> user system elapsed
#> 1.700 0.228 1.928
Convert to spatstat’s im-format (raster type format – can be converted with maptools) and plot:
rslt <- as.im(data.frame(x = xy$x, y = xy$y, counts))
plot(rslt, main = "Point counts in raster cells")
The points overlaid on the counts show that we have done the right thing:
plot(rslt, main = "Point counts in raster cells")
plot(Y, add = TRUE, col = rgb(1,1,1,.7), pch = 3)
I’m sure you can also do something elegant and fast with raster, but I’m not the right one to ask there.
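For what it's worth, here is a rough raster-side sketch of the same idea (my addition, not from the answer above): rasterize the points into per-cell counts, then sum those counts within a circular focal window around every cell. Note that focalWeight measures d in map units (degrees on this grid), so the question's 2,000 km threshold becomes roughly 18 degrees at the equator, and the result only approximates the great-circle version:
library(raster)
cnt <- rasterize(spdf, ras, field = 1, fun = 'count', background = 0) # points per cell
w <- focalWeight(ras, d = 18, type = 'circle') # circular window, radius in map units
w[w > 0] <- 1                                  # turn the weights into a 0/1 mask
ras_counts <- focal(cnt, w = w, fun = sum, na.rm = TRUE, pad = TRUE)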

Measuring the distance of all points along a line in R (linestring in sf)

I have a smoothed line (a simplified abstraction of a coastline) that is a linestring and I want to measure the length of the line at frequent intervals along it. I can create the smoothed line, and measure its length:
library(raster)
library(sf)
library(tidyverse)
library(rnaturalearth)
library(smoothr)
# create bounding box for the line
xmin=-80
xmax=-66
ymin=24
ymax=45
bbox <- extent(xmin, xmax, ymin, ymax)
# get coarse outline of the US
usamap <- rnaturalearth::ne_countries(scale = "small", country = "united states of america", returnclass = "sf")[1] %>%
st_cast("MULTILINESTRING")
# crop to extent of the bbox and get rid of a border line that isn't coastal
bbox2 <- st_set_crs(st_as_sf(as(raster::extent(-80, -74, 42, 45.5), "SpatialPolygons")), st_crs(usamap))
cropmap <- usamap %>%
st_crop(bbox) %>%
st_difference(bbox2)
# smooth the line
smoothmap <- cropmap %>%
smoothr::smooth(method="ksmooth", smoothness=8)
# measure the line length
st_length(smoothmap) # I get 1855956m
I'm going to be "snapping" sample sites to points along this line, and I need to know how far they are along the coastline. However, I can't figure out how to measure the length of the line at intervals along the coastline. The interval size isn't terribly important so long as it's at relatively fine resolution, perhaps every 1km or every 0.01 degree.
What I would like to produce is a dataframe with x,y columns containing the lat/lon of points along the line, and a "length" column containing the distance along the line (from the origin to that point). Here are some of the things I've tried:
Iterating over bounding boxes. I tried cropping the line with smaller and smaller bounding boxes (using regular intervals in lat/lon), but because the line bends "backward" around -76, 38, some of the cropped boxes don't encompass the complete line segment that I expected. This approach works for the top right half of the line, but not for the bottom left half--that just returns lengths of zero.
Cropping the extent before implementing the smoother, and then measuring the line. Since the smoother function does not produce the same shape if it is only measuring a segment of the original line, this doesn't actually measure distance along the same line.
Getting the coordinates of the linestring with st_coordinates, trimming off one row (one point on the line), and recasting the remaining coordinates as a linestring. This approach does not produce a single linestring but instead a chain of points (since st_cast doesn't know how to connect them again), so it can't be measured normally.
It would be ideal to "edit" the geometry of smoothmap to delete one row of coordinates at a time, repeatedly measure the line, and write out the end point coordinates and the line length to a dataframe. However, I'm not sure if it's possible to edit a sf object's coordinates without turning it into a dataframe.
If I understand your question, I think you can do this:
x <- as(smoothmap, "Spatial")
g <- geom(x)
d <- pointDistance(g[-nrow(g), c("x", "y")], g[-1, c("x", "y")], lonlat=TRUE)
gg <- data.frame(g[, c('x','y')], seglength=c(d, 0))
gg$lengthfromhere <- rev(cumsum(rev(gg[,"seglength"])))
head(gg)
# x y seglength lengthfromhere
#1 -67.06494 45.00000 70850.765 1855956
#2 -67.74832 44.58805 2405.180 1785105
#3 -67.77221 44.57474 2490.175 1782700
#4 -67.79692 44.56095 2577.254 1780210
#5 -67.82248 44.54667 2666.340 1777633
#6 -67.84890 44.53189 2757.336 1774967
tail(gg)
# x y seglength lengthfromhere
#539 -79.09383 33.34224 2580.481 111543.5
#540 -79.11531 33.32753 2542.557 108963.0
#541 -79.13648 33.31306 2512.564 106420.5
#542 -79.15739 33.29874 2479.949 103907.9
#543 -79.17802 33.28460 101427.939 101427.9
#544 -80.00000 32.68751 0.000 0.0
I believe you need sf::st_line_sample:
# Transform to metric
smoothmap_utm <- st_transform(smoothmap, 3857)
# Get samples at every kilometer
smoothmap_samples <- st_line_sample(smoothmap_utm, density = 1/1000)
# Transform back to a sf data.frame
smoothmaps_points <- map(smoothmap_samples, function(x) data.frame(geometry = st_geometry(x))) %>%
  map_df(as.data.frame) %>%
  st_sf() %>%
  st_cast("POINT") %>%
  st_set_crs(3857) %>%
  st_transform(4326)
library(mapview) # for a quick interactive check
mapview(smoothmaps_points) + mapview(smoothmap)
Getting to your desired output:
# Function to transform sf to lon,lat
sfc_as_cols <- function(x, names = c("lon", "lat")) {
  stopifnot(inherits(x, "sf") && inherits(sf::st_geometry(x), "sfc_POINT"))
  ret <- sf::st_coordinates(x)
  ret <- tibble::as_tibble(ret)
  stopifnot(length(names) == ncol(ret))
  x <- x[, !names(x) %in% names]
  ret <- setNames(ret, names)
  ui <- dplyr::bind_cols(x, ret)
  st_set_geometry(ui, NULL)
}
smoothmaps_points_xy <- sfc_as_cols(smoothmaps_points) %>%
  mutate(dist = cumsum(c(0, rep(1000, times = n() - 1))))
smoothmaps_points_xy
lon lat dist
1 -67.06836 44.99794 0
2 -67.07521 44.99383 1000
3 -67.08206 44.98972 2000
4 -67.08891 44.98560 3000
5 -67.09575 44.98149 4000
6 -67.10260 44.97737 5000
Important
But if your ultimate goal is to get the distance of points along a path, I would recommend checking out rgeos::gProject.
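A minimal sketch of how that might look (untested; sites_sp is a hypothetical SpatialPoints object holding your sample sites in the same projected CRS as the line):
library(rgeos)
line_sp <- as(smoothmap_utm, "Spatial") # the projected line from above
dist_along <- gProject(line_sp, sites_sp) # distance along the line from its start, in CRS units (metres)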

Draw polygon from raster after occurrence modeling

I want to draw polygons for species occurrence using the same methods BIEN uses, so I can use both my polygons and theirs. They use Maxent to model species occurrence when they have more than a certain number of occurrence points.
So, this is, for example, a BIEN polygon:
library(BIEN)
Mormolyca_ringens<- BIEN_ranges_load_species(species = "Mormolyca ringens")
#And this is a polygon, yes - a SpatialPolygonsDataFrame.
library(maptools) # needed for the wrld_simpl base map
data(wrld_simpl)
plot(wrld_simpl, xlim = c(-100, -40), ylim = c(-30, 30), axes = TRUE, col = "light yellow", bg = "light blue")
plot(Mormolyca_ringens, col = "green", add = TRUE)
Mormolyca ringens polygon
Ok, then I'm trying to draw my own polygons, because BIEN lacks them for some species I need.
# first, you need to download the Maxent software here: http://biodiversityinformatics.amnh.org/open_source/maxent/
#and paste the "maxent.jar" file in the 'java' folder of the 'dismo' package, which is here:
system.file("java", package="dismo")
#You have to do this **before** loading the libraries
#install.packages("rJava")
library(rJava)
#If you get the message that cannot load this library, it's possible that your version of java is not 64bit.
#Go to Oracle and install Java for windows 64bit.
#If library still doesn't load: Look in your computer for the path where the java's jre file is and paste in the code below
Sys.setenv(JAVA_HOME="your\\path\\for\\jre") #mine is "C:\\Program Files\\Java\\jre1.8.0_144", for example
library(rJava)
library(dismo)
library(maptools)
#Giving credits: I wrote the following code based on this tutorial: https://cran.r-project.org/web/packages/dismo/vignettes/sdm.pdf
#Preparing the example data - the map
data(wrld_simpl)
ext = extent(-90, -32, -33, 23)
#Preparing the example data - presence data for Bradypus variegatus
file <- paste(system.file(package="dismo"), "/ex/bradypus.csv", sep="")
bradypus <- read.table(file, header=TRUE, sep=',')
bradypus <- bradypus[,-1] # don't need the first column
#Getting the predictors (the variables)
files <- list.files(path = paste(system.file(package = "dismo"), '/ex', sep = ''),
                    pattern = 'grd', full.names = TRUE)
predictors <- stack(files)
#making a training and a testing set.
group <- kfold(bradypus, 5)
pres_train <- bradypus[group != 1, ]
pres_test <- bradypus[group == 1, ]
#Creating the background
backg <- randomPoints(predictors, n=1000, ext=ext, extf = 1.25)
colnames(backg) = c('lon', 'lat')
group <- kfold(backg, 5)
backg_train <- backg[group != 1, ]
backg_test <- backg[group == 1, ]
# Running maxent
xm <- maxent(predictors, pres_train, factors='biome')
plot(xm)
#A response plot:
response(xm)
# Evaluating and predicting
e <- evaluate(pres_test, backg_test, xm, predictors)
px <- predict(predictors, xm, ext=ext, progress='text', overwrite=TRUE)
#Checking result of the prediction
par(mfrow=c(1,2))
plot(px, main='Maxent, raw values')
plot(wrld_simpl, add=TRUE, border='dark grey')
tr <- threshold(e, 'spec_sens')
plot(px > tr, main='presence/absence')
plot(wrld_simpl, add=TRUE, border='dark grey')
points(pres_train, pch='+')
At this point, I have the following image:
Prediction for example's occurrence
And I'm trying to make a polygon from this raster with this code:
predic_pol <- rasterToPolygons(px)
And also:
px_rec<-reclassify(px, rcl=0.5, include.lowest=FALSE)
px_pol<-rasterToPolygons(px_rec)
But I keep getting a pixel-by-pixel version of my extent.
Can you please give me a hint so I can extract a polygon out of this raster, like the BIEN's one? (Also I'm new to modeling and to R... any tips are welcome)
EDIT: this is the px console output:
> px
class : RasterLayer
dimensions : 172, 176, 30272 (nrow, ncol, ncell)
resolution : 0.5, 0.5 (x, y)
extent : -120, -32, -56, 30 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
data source : C:\Users\thai\Documents\ORCHIDACEAE\Ecologicos\w2\predictions\Trigonidiumobtusum_prediction.grd
names : layer
values : 6.705387e-06, 0.9999983 (min, max)
Thank you in advance
Edit 2: Solution
Thanks to @Val I got to this:
#Getting only the values > tr to make the polygon
#("tr" is what gives me the green raster instead of the multicolour one)
pol <- rasterToPolygons(px > tr, function(x) x == 1, dissolve = TRUE)
#Ploting
plot(wrld_simpl, xlim=c(-120,-20), ylim=c(-60,10), axes=TRUE,col="light yellow", bg="light blue")
plot(pol, add=T, col="green")
And now I have what I wanted! Thank you!
(The polygon is not the same in the figures only because I used a different data set that I had in my environment at the moment I got @Val's answer.)
Bonus question:
Do you know how to smooth the edges so I get a non-pixelated polygon?
I don't know BIEN, so I didn't really look at this part of your example. I just generalized your problem/question down to the following:
You have a binary raster (with 0 for absence and 1 for presence) and you want to convert all areas with 1 to a polygon.
As for your px raster, it's a bit odd that your values are not exactly 0 and 1 but rather approximately 0 and approximately 1. If that's a problem, it is an easy fix.
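For example, thresholding gives a clean binary raster (a one-line sketch using the tr threshold computed in the question):
px_bin <- px > tr # logical raster: 1 where predicted presence, 0 elsewhere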
So I tried to recreate your example with just the area of Brasil:
library(raster)
library(rgeos)
# get Brasil borders
shp <- getData(country = 'BRA',level=0)
#create binary raster
r <- raster(extent(shp),resolution=c(0.5,0.5))
r[] <- NA # values have to be NA for the buffering
# take centroid of Brasil as center of species presence
cent <- gCentroid(shp)
# set to 1
r[cellFromXY(r,cent)] <- 1
# buffer presence
r <- buffer(r,width=1000000)
# set rest 0
r[is.na(r)] <- 0
# mask by borders
r <- mask(r,shp)
This is close enough to your raster I guess:
So now to the conversion to the polygon:
pol <- rasterToPolygons(r,function(x) x == 1,dissolve=T)
I use a function to only get pixels with value 1. Also I dissolve the polygons to not have single pixel polygons but rather an area. See rasterToPolygons for other options.
And now plot the borders and the new polygon together:
plot(shp)
plot(pol,col='red',add=T)
And there you have it, a polygon of the distribution. This is the console output:
> pol
class : SpatialPolygonsDataFrame
features : 1
extent : -62.98971, -43.48971, -20.23512, -1.735122 (xmin, xmax, ymin, ymax)
coord. ref. : NA
variables : 1
names : layer
min values : 1
max values : 1
Hope that helps!
Edit: Bonus answer
You have to be clear that the pixelated boundaries of your polygon(s) are an accurate representation of your data. So any change to that means a loss of precision. Now, depending on your purpose, that might not matter.
There are multiple ways to achieve this, either on the raster side, with disaggregating and smoothing/filtering etc., or on the polygon side, where you can apply specific smoothing filters to the polygons.
If it's purely aesthetic, you can try gSimplify from the rgeos package:
# adjust tol for smoothness
pol_sm <- gSimplify(pol,tol=0.5)
plot(pol)
lines(pol_sm,col='red',lwd=2)
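Alternatively (my suggestion, not part of the original answer), the smoothr package used earlier in this thread can smooth polygon edges as well:
library(smoothr)
library(sf)
pol_sf <- st_as_sf(pol) # convert the SpatialPolygonsDataFrame
pol_sm2 <- smooth(pol_sf, method = "ksmooth", smoothness = 2)
plot(st_geometry(pol_sf))
plot(st_geometry(pol_sm2), border = "blue", add = TRUE)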
