Finding NoData values inside the extent of a shapefile and discarding the values outside the extent (raster)

I have Randolph Glacier Inventory boundary shapefiles of glaciers in Himachal Pradesh. I clipped three different rasters with these shapefiles and then stacked them together. The clipped rasters contain NoData values, and I need to find both the NoData values and the valid pixel values inside them. But when I extract the raster, I get more values than there should be.
For example, the area of one glacier/shapefile is 8.719 km² and the raster has a 10 m pixel size (100 m² per pixel), so the glacier should cover about 87,190 pixels; instead I am getting approximately 333,233 (that may be because of the bounding box). So I decided to create a binary mask to keep only the values inside the glacier boundary, but I still get far more values than 87,190.
The stacked rasters all have the same resolution and the same extent, yet when I extract them and multiply the extracted array by the mask given below, the number of pixels extracted differs between bands.
The code I used for making the binary mask is given below. I want to write code that extracts the pixel values inside the raster (only inside the boundary of the shapefile), keeping the NoData values present within them. The mask I created and the shapefile of the same glacier are shown in the attached images.
import geopandas as gpd
import matplotlib.pyplot as plt
import numpy as np
import rasterio
import rasterio.mask
from rasterio.features import rasterize
from shapely.geometry import Polygon
from shapely.ops import unary_union   # cascaded_union is deprecated

shape_path = "E:/semester_4/glaciers of lahaul spiti clipped/RGIId_RGI60-14.11841/11841.shp"
glacier_df = gpd.read_file(shape_path)

raster_path = "E:/semester_4/glaciers of lahaul spiti clipped/RGIId_RGI60-14.11841/stack9/raster_stack_sv_srtm.tif"
with rasterio.open(raster_path, "r") as src:
    raster_img = src.read()
    raster_meta = src.meta
    print("CRS Raster: {}, CRS Vector: {}".format(src.crs, glacier_df.crs))

def poly_from_utm(polygon, transform):
    """Convert a polygon from map coordinates to image (row/col) coordinates."""
    poly_pts = []
    poly = unary_union(polygon)
    for i in np.array(poly.exterior.coords):
        # ~transform maps map coordinates to pixel coordinates
        poly_pts.append(~transform * tuple(i))
    return Polygon(poly_pts)

poly_shp = []
im_size = (raster_meta['height'], raster_meta['width'])
for num, row in glacier_df.iterrows():
    if row['geometry'].geom_type == 'Polygon':
        poly_shp.append(poly_from_utm(row['geometry'], raster_meta['transform']))
    else:
        # MultiPolygon: convert each part separately
        for p in row['geometry'].geoms:
            poly_shp.append(poly_from_utm(p, raster_meta['transform']))

# burn 1 inside the glacier outline(s), 0 elsewhere
mask_stack_sv_srtm = rasterize(shapes=poly_shp, out_shape=im_size)

plt.figure(figsize=(5, 5))
plt.imshow(mask_stack_sv_srtm)
plt.show()
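If the goal is simply to count, per band, how many pixels fall inside the glacier outline and how many of those are NoData, rasterio.features.geometry_mask can build the boolean mask in one step and numpy can do the counting. A minimal sketch, assuming the same shape_path and raster_path as above (one possible approach, not the only one):

import numpy as np
import geopandas as gpd
import rasterio
from rasterio.features import geometry_mask

glacier_df = gpd.read_file(shape_path)

with rasterio.open(raster_path) as src:
    geoms = glacier_df.to_crs(src.crs).geometry
    # True for every pixel whose centre falls inside the glacier outline
    inside = geometry_mask(geoms, out_shape=(src.height, src.width),
                           transform=src.transform, invert=True)
    stack = src.read()
    nodata = src.nodata

print("pixels inside the outline:", inside.sum())

for b in range(stack.shape[0]):
    values = stack[b][inside]      # the same pixel count for every band
    if nodata is not None and np.isnan(nodata):
        n_nodata = np.isnan(values).sum()
    else:
        n_nodata = (values == nodata).sum() if nodata is not None else 0
    print("band {}: {} pixels inside, {} of them NoData".format(b, values.size, n_nodata))

Because the boolean mask is computed once from the geometry alone, every band yields the same pixel count; the per-band differences described above most likely come from counting non-zero values after multiplying by the mask, which drops zeros and NoData pixels alike. Note that, like rasterize above, geometry_mask only selects pixels whose centres fall inside the outline (pass all_touched=True to include every pixel the boundary touches), so the count can still differ slightly from area divided by pixel area; reading the full clipped bounding box is what produces the much larger figure of roughly 333,233.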

Related

Issue extracting raster pixel values in different elevation zones in R

I have large daily datasets in raster format. I want to calculate the total number of pixels, by value, within the different polygons of a single shapefile. The shapefile is an SRTM DEM (90 m) classified into 24 elevation zones, so it contains 24 polygons, one per zone. I want to check how many pixels fall inside each polygon.
Primarily, I need to count the pixels with the following values (200, 210, 240, 250) in each polygon and finally store the result in a CSV.
I have already developed some code, but I am facing an issue at the end.
library(sp)
library(rgdal)
library(raster)

mod <- raster("MOYDGL06_Maximum_Snow_Extent_2004097.tif")
shp <- readOGR("Gilgit_DEM_24.shp")

# all cell values per polygon, as a data frame
mod_ext <- extract(mod, shp, df = TRUE, na.rm = TRUE)

# raster restricted to the polygons
mod_mask <- mask(mod, shp)
plot(mod_mask, axes = TRUE, ext = extent(shp))

# points for each value of interest
r3_200 <- rasterToPoints(mod_mask, function(x) x == 200, spatial = TRUE)
r3_210 <- rasterToPoints(mod_mask, function(x) x == 210, spatial = TRUE)
r3_240 <- rasterToPoints(mod_mask, function(x) x == 240, spatial = TRUE)
r3_250 <- rasterToPoints(mod_mask, function(x) x == 250, spatial = TRUE)

# attach polygon attributes to the value-200 points and export
r3_200_1 <- raster::intersect(shp, r3_200)
write.csv(r3_200_1, file = "r2_extract_gilgit.csv")
Image and R code available in this link
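One way around the final step (a rough sketch, untested on the real data, using the same file names as in the question; the output file name is arbitrary) is to let extract() return all cell values per polygon and then tabulate the values of interest for each polygon:

library(raster)
library(rgdal)

mod <- raster("MOYDGL06_Maximum_Snow_Extent_2004097.tif")
shp <- readOGR("Gilgit_DEM_24.shp")
vals <- c(200, 210, 240, 250)

# extract() without a summary function returns a list holding one vector of
# cell values per polygon; count the values of interest in each vector
cell_values <- extract(mod, shp)
counts <- t(sapply(cell_values, function(x)
  sapply(vals, function(v) sum(x == v, na.rm = TRUE))))
colnames(counts) <- paste0("count_", vals)

# one row per elevation zone, with its attributes and the four counts
write.csv(cbind(shp@data, counts), "pixel_counts_per_zone.csv", row.names = FALSE)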

Calculate slope over a gridded latitude/longitude coordinate area with corresponding depths in R

I have built a gridded area in the Gulf of Alaska with a resolution of 0.02 decimal degrees (~1nm);
library(sp)
library(rgdal)
library(dplyr)   # needed for the %>% pipe used below
# Set interval for grid cells.
my.interval = 0.02  # if 1 is 1 degree (60 nm), then 0.1 is every 6 nm, 0.05 is every 3 nm, and 0.0167 is every 1 nm
# Select range of coordinates for grid boundaries (UTM to maintain constant grid cell area regardless of geographic location).
lonmin = -140.5083
lonmax = -131.2889
latmin = 53.83333
latmax = 59.91667
LON = seq(lonmin, lonmax, by=my.interval)
LAT = seq(latmin, latmax, by=my.interval)
# Compile series of points for grid:
mygrd = expand.grid(
  Longitude = seq(lonmin, lonmax, by = my.interval),
  Latitude = seq(latmin, latmax, by = my.interval)) %>%
  # mutate(z = 1:n()) %>%
  data.frame
I exported that grid as a .csv file and brought it into ArcGIS where I used a few bathymetry rasters to extract the bottom depth at the midpoint of each cell. I then exported that from GIS back into R as a .csv file data frame. So now it has another column called "Depth" on it.
For now, I'll just add a column with random "depth" numbers in it:
mygrd$Depth<-NA
mygrd$Depth<-runif(nrow(mygrd), min=100, max=1000)
I would like to calculate the slope at the midpoint of each cell (between points).
I've been trying to do this with the slope() function in the SDMTools package, which requires a SpatialGridDataFrame from the sp package.
I can't get this to work, and I'm not sure it is the easiest way to do it anyway.
I have a data frame with three columns: Longitude, Latitude, and Depth, and I'd like to calculate slope from it. If anyone knows a better way to do this, let me know! Any help is much appreciated!
Here is some of the code I've been trying to use:
library(SDMTools)
library(sp)

proj <- CRS('+proj=longlat +datum=WGS84')
coords <- mygrd[, 1:2]
t2 <- SpatialPointsDataFrame(coords = coords, data = mygrd, proj4string = proj)
# promote the points to a regular grid carrying the Depth values
t2 <- SpatialPixelsDataFrame(points = t2[c("Longitude", "Latitude")], data = mygrd["Depth"])
t3 <- as(t2, "SpatialGridDataFrame")
class(t3)
slope.test <- slope(t3, latlon = TRUE)
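Since the grid built with expand.grid() is regular, one possibly simpler route (a sketch, not tested on real data) is to skip the sp classes entirely: turn the Longitude/Latitude/Depth data frame into a raster and let terrain() from the raster package compute the slope of each cell.

library(raster)

# rasterFromXYZ() works here because the points are regularly spaced
r <- rasterFromXYZ(mygrd[, c("Longitude", "Latitude", "Depth")],
                   crs = "+proj=longlat +datum=WGS84")
slope_r <- terrain(r, opt = "slope", unit = "degrees", neighbors = 8)

# back to a data frame of cell midpoints with their slope values
slope_df <- as.data.frame(slope_r, xy = TRUE)
head(slope_df)

Check ?terrain for how horizontal distances are handled for a longitude/latitude raster before relying on the absolute slope values.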

R Converting contour lines to elevation plot

I would like to be able to create an elevation plot from contour lines in R. I am very new to using shapefiles.
At the moment I have downloaded data from here, which provides .shp files for all of the UK. It also provides the contour lines, summarising the topography of the UK.
For the elevation plot I would like a data.frame or data.table of evenly spaced points (100 m apart from each other), giving an x, y and z value for each point, where x and y represent the latitude and longitude (or Eastings and Northings) and z represents the height in metres above sea level.
I think there are probably some tools that will automatically carry out the interpolation for you, but I am unsure how they would work with geospatial data.
This is my basic start...
require(maptools)
xx <- readShapeSpatial("HP40_line.shp")
Choose "ASCII Grid and GML (Grid)" as download format for the "OS Terrain 50" product, and download the file. This will give you a zip file containing many directories of zip files, each of which contains portions of a 50 m elevation grid of the UK (the portion I looked at had 200 x 200 cells, meaning 10 km x 10 km). I went into the directory data/su, unzipped the zip file there, and did
library(raster)
r = raster("SU99.asc")
plot(r)
To aggregate this to a 100 m grid, I did
r100 = aggregate(r) # default is factor 2: 50 -> 100 m
As mentioned above, the advice is to work on the grids: contour lines are derived from grids, and working the other way around is painful and loses a great deal of information.
Getting grid values in longitude latitude as a data.frame can be done in two ways:
df = as.data.frame(projectRaster(r, crs = CRS("+proj=longlat")), xy = TRUE)
unprojects the grid to a new grid in longitude / latitude. As these grids cannot coincide, it minimally moves points (see ?projectRaster).
The second option is to convert the grid to points, and unproject these to longitude latitude, by
df2 = as.data.frame(spTransform(as(r, "SpatialPointsDataFrame"), CRS("+proj=longlat")))
This does not move points, and as a consequence does not result in a grid.

How to get count of non-NA raster cells within polygon

I've been running into all sorts of issues using ArcGIS ZonalStats and thought R could be a great alternative. That said, I'm fairly new to R, but I have a coding background.
The situation is that I have several rasters and a polygon shape file with many features of different sizes (though all features are bigger than a raster cell and the polygon features are aligned to the raster).
I've figured out how to get the mean value for each polygon feature using the raster library with extract:
#load packages required
require(rgdal)
require(sp)
require(raster)
require(maptools)
# ---Set the working directory-------
datdir <- "/test_data/"
#Read in a ESRI grid of water depth
ras <- readGDAL("test_data/raster/pl_sm_rp1000/w001001.adf")
#convert it to a format recognizable by the raster package
ras <- raster(ras)
#read in polygon shape file
proxNA <- readShapePoly("test_data/proxy/PL_proxy_WD_NA_test")
#plot raster and shp
plot(ras)
plot(proxNA, add = TRUE)
#calc mean depth per polygon feature
#unweighted - only assigns grid to district if centroid is in that district
proxNA@data$RP1000 <- extract(ras, proxNA, fun = mean, na.rm = TRUE, weights = FALSE)
#check results
head(proxNA)
#plot depth values
spplot(proxNA[,'RP1000'])
The issue I have is that I also need an area-based ratio between the area of the polygon and the area of all non-NA cells in the same polygon. I know what the cell size of the raster is and I can get the area for each polygon, but the missing link is the count of all non-NA cells in each feature. I managed to get the cell numbers of all the cells in each polygon with proxNA@data$Cnumb1000 <- cellFromPolygon(ras, proxNA), and I'm sure there is a way to get the actual values of those raster cells, which would then require a loop to count the non-NA cells, and so on.
BUT I'm sure there is a much better and quicker way to do that! If any of you has an idea or can point me in the right direction, I would be very grateful!
I do not have access to your files, but based on what you described, this should work:
library(raster)

# shapedir and template_raster_dir stand in for your own directories
mask_layer <- shapefile(paste0(shapedir, "AOI.shp"))
original_raster <- raster(paste0(template_raster_dir, "temp_raster_DecDeg250.tif"))

nonNA_raster <- !is.na(original_raster)       # 1 where there is data, 0 where NA
masked_img <- mask(nonNA_raster, mask_layer)  # based on centroid location of cells
nonNA_count <- cellStats(masked_img, sum)     # total non-NA cells inside the AOI
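If a separate count is needed for every feature in proxNA rather than one total for the whole AOI, extract() can apply the same idea polygon by polygon. A sketch using the objects from the question (the new column names are arbitrary):

# one non-NA cell count per polygon feature
proxNA@data$nonNAcells <- extract(ras, proxNA,
                                  fun = function(x, ...) sum(!is.na(x)))

# the area covered by data in each feature is then count * cell area,
# which can be compared with the polygon area to get the ratio
cell_area <- prod(res(ras))
proxNA@data$dataArea <- proxNA@data$nonNAcells * cell_area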

Location data format for the adehabitat package

I have a file in this format:
ASCII format
The first rows look like this:
ncols 1440
nrows 720
xllcorner -180.0
yllcorner -90
cellsize 0.25
NODATA_value -9999
Basically I have the world with 1440 'tiles' in x direction (longitude) and 720 'tiles' in y direction (latitude). Each 'tile' is a square with a length of 0.25 degrees. I think I have xllcorner and yllcorner correct. I can draw this map like this in R:
library("adehabitat")
bio1 <- import.asc("D:/ENFA/data.asc")
maps <- as.kasc(list(data = bio1))
image(maps, col = cm.colors(256), clfac = list(Aspect = cl))
The map looks fine.
I would like to perform some ecological niche factor analysis (ENFA) using the adehabitat package and am not too sure about the location data. Basically I have them as longitudes and latitudes at the moment, but I could also generate them as a 'tile index' (e.g. the lower-left corner has latitude -90 and longitude -180, so its 'tile index' would be 0, 0, right?). Which is the correct location data format? I would use ENFA code like this:
locs <- read.table("D:/ENFA/Locs.txt", header = TRUE, sep="\t")
dataenfa1 <- data2enfa(maps, locs)
pc <- dudi.pca(dataenfa1$tab, scannf = FALSE)
enfa1 <- enfa(pc, dataenfa1$pr,scannf = FALSE)
hist(enfa1)
I would appreciate any comments please. Thanks in advance.
The problem with leaving your coordinates in lat-long form is that, at most places on earth, a degree of longitude has a different length than a degree of latitude. This might distort your ENFA by exaggerating distances in some directions relative to those in others.
Especially if your data are from a relatively small area, I'd suggest re-expressing the coordinates in meters along a W/E x-axis and an S/N y-axis. If all of your points fall inside a single UTM zone, then you could do the conversion within R, using project() in the rgdal package:
Here's one example, taken from here:
library(rgdal)
# Make a two-column matrix, col1 = long, col2 = lat
xy <- cbind(c(118, 119), c(10, 50))
# Convert it to UTM coordinates (in units of meters)
project(xy, "+proj=utm +zone=51 +ellps=WGS84")
[,1] [,2]
[1,] -48636.65 1109577
[2,] 213372.05 5546301
Much more info about how to manipulate spatial data is available in "Applied Spatial Data Analysis with R" by Bivand, Pebesma, and Gómez-Rubio. If you need more specific assistance, try the R-sig-Geo mailing list.
Hope this helps.
Maybe you want to convert the coordinates into GHAM (Global, Hierarchical, Alphanumeric, and Morton-encoded), which represents the globe by cells of arbitrary precision (as fine or coarse as you wish), so any lat/lon has a single alphanumeric address that remains sortable.
Here's the abstract from GHAM: A compact global geocode suitable for sorting, by Duncan Agnew:
The GHAM code is a technique for labeling geographic locations based on their positions. It defines addresses for equal-area cells bounded by constant latitude and longitude, with arbitrarily fine precision. The cell codes are defined by applying Morton ordering to a recursive division into a 16 by 16 grid, with the resulting numbers encoded into letter–number pairs. A lexical sort of lists of points so labeled will bring near neighbors (usually) close together; tests on a variety of global datasets show that in most cases the actual closest point is adjacent in the list 50% of the time, and within 5 entries 80% of the time.
Source code is in the IAMG repository, but if you can't access it I'm sure he would provide it.
