How to plot GeoTIFF data in Python with imshow

I have a GeoTIFF raster dataset containing elevation data. NoData cells in the raster are set to -9999. When I try to plot it with the code below:
from osgeo import gdal  # recent GDAL releases no longer support a bare `import gdal`
import numpy as np
from mayavi import mlab
ds = gdal.Open('data.tif')
dem = ds.ReadAsArray()
gt = ds.GetGeoTransform()
ds = None
mlab.imshow(dem)
mlab.colorbar()
mlab.show()
The problem is that the plot also includes the NoData values. How do I exclude the -9999 values (or select a value range to plot) from the raster image?
The link to data is:
https://drive.google.com/file/d/0B2rkXkOkG7ExR1VsVW5HQXBhSDQ/view?usp=sharing

In case you still need a clean solution to this question, I believe what you are looking for is a masked array from numpy.ma, like:
from osgeo import gdal  # recent GDAL releases no longer support a bare `import gdal`
import numpy as np
from mayavi import mlab
ds = gdal.Open('data.tif')
dem = ds.ReadAsArray().astype(float)  # float dtype so that NaN is a valid fill value
msk = dem == -9999  # boolean array with True at elements to be masked
dem = np.ma.array(data=dem, mask=msk, fill_value=np.nan)
gt = ds.GetGeoTransform()
ds = None  # release the dataset
mlab.imshow(dem)
mlab.colorbar()
mlab.show()
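For reference, here is a minimal sketch of the two options the question mentions (dropping the NoData cells, or keeping only a value range). It uses NumPy only and a small synthetic array in place of `ds.ReadAsArray()`. The range limits 115 and 170 are made-up values for illustration.

```python
import numpy as np

NODATA = -9999  # NoData value stated in the question

# Small synthetic "DEM" standing in for ds.ReadAsArray()
dem = np.array([[120.0, 135.5, NODATA],
                [NODATA, 150.25, 160.0],
                [110.0, NODATA, 175.75]])

# Option 1: mask only the NoData cells
masked = np.ma.masked_equal(dem, NODATA)

# Option 2: keep only a chosen elevation range (hypothetical limits)
in_range = np.ma.masked_outside(dem, 115.0, 170.0)

# Colour limits computed from the valid cells only
vmin, vmax = masked.min(), masked.max()

# Plain ndarray with NaN where data is missing, for tools that ignore masks
dem_nan = masked.filled(np.nan)

print(masked.count(), vmin, vmax)  # number of valid cells and their range
print(in_range.count())            # number of cells inside the chosen range
```

Whether a given plotting function honours the mask or needs the NaN-filled array depends on the library, so it is worth trying both `masked` and `dem_nan`.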

Related

finding no data values inside the extent of a shapefile and discarding the values outside the extent

I have Randolph Glacier Inventory boundary shapefiles of glaciers in Himachal Pradesh. I clipped three different rasters with these shapefiles and then stacked them together. The clipped rasters contain no-data values, and I need to find both the no-data values and the valid pixel values inside them. But when I extract values from the raster, I get more values than there should be.
For example, the area of one glacier (shapefile) is 8.719 sq. km and the raster resolution is 10 m (100 sq. m per pixel), so the raster should contain about 87,190 pixels, but I am getting approximately 333,233 (that may be because of the bounding box). So I decided to create a binary mask to get only the values inside the glacier boundary, but I am still getting far more than 87,190 values.
The stacked rasters all have the same resolution and the same extent, yet when I extract them and multiply the extracted array by the mask given below, the number of pixels extracted differs between two bands.
The code I used to make the binary mask is given below.
I want to write code that extracts the pixel values inside the raster (only inside the boundary of the shapefile), including any no-data values present there.
This is the mask I created
This is the shapefile of the same glacier
import geopandas as gpd
import matplotlib.pyplot as plt
import numpy as np
import rasterio
import rasterio.mask
from rasterio.features import rasterize
from rasterio.plot import reshape_as_image
from shapely.geometry import mapping, Point, Polygon
from shapely.ops import unary_union  # replaces the deprecated cascaded_union

# raw string so the backslashes in the Windows path are not read as escapes
shape_path = r"E:\semester_4\glaciers of lahaul spiti clipped\RGIId_RGI60-14.11841/11841.shp"
glacier_df = gpd.read_file(shape_path)
raster_path = 'E:/semester_4/glaciers of lahaul spiti clipped/RGIId_RGI60-14.11841/stack9/raster_stack_sv_srtm.tif'
with rasterio.open(raster_path, "r") as src:
    raster_img = src.read()
    raster_meta = src.meta
    print("CRS Raster: {}, CRS Vector: {}".format(src.crs, glacier_df.crs))

def poly_from_utm(polygon, transform):
    # convert a polygon from map (UTM) coordinates to pixel coordinates
    poly_pts = []
    poly = unary_union(polygon)
    for i in np.array(poly.exterior.coords):
        # ~transform is the inverse affine: map coordinates -> (col, row)
        poly_pts.append(~transform * tuple(i))
    new_poly = Polygon(poly_pts)
    return new_poly

poly_shp = []
im_size = (raster_meta['height'], raster_meta['width'])
for num, row in glacier_df.iterrows():
    if row['geometry'].geom_type == 'Polygon':
        poly = poly_from_utm(row['geometry'], raster_meta['transform'])
        poly_shp.append(poly)
    else:
        # MultiPolygon: iterate over its parts (.geoms is required in Shapely 2)
        for p in row['geometry'].geoms:
            poly = poly_from_utm(p, raster_meta['transform'])
            poly_shp.append(poly)

mask_stack_sv_srtm = rasterize(shapes=poly_shp, out_shape=im_size)
plt.figure(figsize=(5, 5))
plt.imshow(mask_stack_sv_srtm)
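One possible explanation for the differing counts (an assumption, since the extraction step is not shown) is that multiplying by the mask leaves zeros outside the glacier, and any later count of non-zero values then depends on the data in each band. The sketch below uses small synthetic arrays in place of `raster_img` and `mask_stack_sv_srtm`, and a hypothetical no-data value of -9999. It selects pixels with boolean indexing, so every band returns exactly the same number of pixels, and it counts no-data pixels separately.

```python
import numpy as np

NODATA = -9999.0  # hypothetical no-data value; use the one stored in the raster

# Expected pixel count from the figures in the question:
# 8.719 sq. km of glacier at 10 m x 10 m (100 sq. m) per pixel
expected_pixels = round(8.719e6 / (10 * 10))

# Synthetic stand-ins: a 2-band stack (bands, rows, cols) and a 0/1 mask (rows, cols)
stack = np.array([[[5.0, 0.0, 7.0],
                   [NODATA, 3.0, 9.0]],
                  [[1.0, 2.0, 0.0],
                   [4.0, NODATA, 0.0]]])
mask = np.array([[1, 1, 0],
                 [1, 1, 0]], dtype=np.uint8)

inside = mask.astype(bool)

# Boolean indexing keeps exactly the pixels inside the polygon, for every band
values_inside = stack[:, inside]                   # shape: (bands, n_inside)

n_inside = int(inside.sum())                       # same for all bands
n_nodata = (values_inside == NODATA).sum(axis=1)   # no-data pixels per band
n_valid = n_inside - n_nodata                      # valid pixels per band

# For comparison: multiply by the mask, then count non-zero values.
# The result depends on the data, so it can differ between bands.
n_nonzero_after_multiply = np.count_nonzero(stack * mask, axis=(1, 2))

print(expected_pixels, n_inside, n_nodata, n_valid, n_nonzero_after_multiply)
```

With real data, `n_inside` is the number to compare against the expected count. A remaining difference could come from how edge pixels are rasterized.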

geopandas plotting - Identify locations that fall outside of the map

I have a shapefile that shows the map of Pakistan at district level. I also have a geodataframe that has information about polling stations in Pakistan.
I have mapped the geodataframe on to the shapefile, but noticed that some lat/lon values from the geodataframe are wrong i.e. they lie outside Pakistan.
I want to identify which polling stations these are. (I want to select those rows from the geodataframe) Is there a way to do this?
Please see below for reference - the black dots indicate polling stations, and the colourful map is the map of Pakistan at district levels:
image_pakistan_map_pollingstations
edit:
So I'm trying this and it seems to work, however it's taking a very long time to run (been running it for 5+ hrs now) - for reference, the geodataframe has about 50,000 rows and it's called ours_NA_gdf.
for i in range(len(ours_NA_gdf)):
    if ours_NA_gdf['geometry'][i].within(pakistan['geometry'][0]):
        ours_NA_gdf.at[i, 'loc_validity'] = 'T'
    else:
        ours_NA_gdf.at[i, 'loc_validity'] = 'F'
ours_NA_gdf[ours_NA_gdf['loc_validity']=='F']
I suspect that the geometries of Pakistan you use are the problem: they are too complex and detailed. For your use case, the simple geometry provided by naturalearth_lowres should give better performance. Here is runnable code that uses the simple Pakistan geometry to perform contains() operations and assigns a color to each point for plotting on the map.
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from cartopy import crs as ccrs
# create a geoDataFrame of points locations across Pakistan areas
pp = 40
lons = np.linspace(60, 80, pp)
lats = np.linspace(22, 39, pp)
# create point geometry
# points will be plotted across Pakistan in red (outside) and green (inside)
points = [Point(xy) for xy in zip(lons, lats)]
# create a dataframe of 3 columns
mydf = pd.DataFrame({'longitude': lons, 'latitude': lats, 'point': points})
# manipulate dataframe geometry
gdf = mydf.drop(['longitude', 'latitude'], axis=1)
gdf = gpd.GeoDataFrame(gdf, crs="EPSG:4326", geometry=gdf.point)
fig, ax = plt.subplots(figsize=(6,7), subplot_kw={'projection': ccrs.PlateCarree()})
# note: gpd.datasets was removed in GeoPandas 1.0, so this line needs an older version
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
parki = world[(world.name == "Pakistan")] #take a country of interest
# grab the geometry of Pakistan
# can .simplify() it if need be
pg = parki['geometry']
newcol = []
for index, row in gdf.iterrows(): # Looping over all points
    res = pg.contains(row.geometry).values[0]
    newcol.append(res)
# add a new column ('insideQ') to the geodataframe
gdf['insideQ'] = newcol
# add a new column ('color') to the geodataframe
gdf.loc[:, 'color'] = 'green' #set color='green'
# this set color='red' to selected rows
gdf.loc[gdf['insideQ']==False, 'color'] = 'red'
# plot Pakistan
ax.add_geometries(parki['geometry'], crs=ccrs.PlateCarree(), color='lightpink', label='Pakistan')
# plot all points features of `gdf`
gdf.plot(ax=ax, zorder=20, color=gdf.color)
ax.set_extent([60, 80, 22, 39]) #zoomed-in to Pakistan
LegendElement = [
    mpatches.Patch(color='lightpink', label='Pakistan')
]
ax.legend(handles = LegendElement, loc='best')
plt.show()
The output plot:
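As a complement to the row-by-row loops above, the same test can be written as a single vectorized call. This is a minimal sketch, not the answerer's method: the rectangle `country` and the four station points are made-up stand-ins for the real outline and the real `ours_NA_gdf`.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical stand-in for the country outline: a simple rectangle
country = box(60, 23, 78, 37)

# A few made-up station locations (longitude, latitude)
stations = gpd.GeoDataFrame(
    {"station_id": [1, 2, 3, 4]},
    geometry=[Point(70, 30), Point(67, 25), Point(85, 30), Point(70, 10)],
    crs="EPSG:4326",
)

# One vectorized call tests every point against the polygon
stations["inside"] = stations.within(country)

# Rows whose coordinates fall outside the outline
outside = stations[~stations["inside"]]
print(outside["station_id"].tolist())
```

With a district-level shapefile, the district polygons would first need to be merged into a single outline, for example with the GeoDataFrame's `unary_union` attribute (or `union_all()` in newer GeoPandas).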

R raster: the cropped raster has a different color (brightness) from the original one?

I would like to crop a multiband raster (4 bands) by spatial polygons (in SpatialPolygonsDataFrame). When I displayed the original and cropped rasters in QGIS, I found that the cropped raster had different colours from the original one. Here is my code:
library(raster)
mosaic_shp <- shapefile("mo_clipper.shp")
mosaic <- brick('orthomosaic.tif')
mosaic_sub <- crop(mosaic, extent(mosaic_shp))
writeRaster(mosaic_sub, 'mosaic_sub.tif', format = "GTiff", overwrite = TRUE)
Partial cropped raster and the corresponding part in original raster in QGIS:
I have no idea how to deal with this issue, any help will be appreciated.
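After comparing the two rasters carefully in QGIS, I found the answer. The issue is related to the datatype argument of the writeRaster function. If datatype is not set, writeRaster uses its default type (FLT4S unless changed), which may not match the original file, so QGIS can render the values differently. So we just need to modify the code like this: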
library(raster)
mosaic_shp <- shapefile("mo_clipper.shp")
mosaic <- brick('orthomosaic.tif')
mosaic_sub <- crop(mosaic, extent(mosaic_shp))
data_type <- unique(dataType(mosaic)) # get data type from original raster;
writeRaster(mosaic_sub, 'mosaic_sub.tif', format = "GTiff", overwrite = TRUE,
            datatype = data_type) # set datatype;

Importing elevation data into geoviews/holoviews (data format)

How do you import other raster formats into geoviews/holoviews other than the netCDF files shown in the tutorials? For example I want to import an ESRI .bil file and wrap it into a geoviews image but it is not importing correctly. Is there a way to do this, am I missing a step?
Ultimately I want to be able to project point data onto the raster image at the correct lat, long positions to extract data, but I can't seem to get the points projected in the correct positions.
from cartopy import crs
import holoviews as hv
import xarray as xr
import geoviews as gv
import pandas as pd
from osgeo import gdal
kdims = ['easting', 'northing']
vdims = ['elevation']
xr_raster = gv.Dataset(DEM, kdims=kdims, vdims=vdims, crs=crs.PlateCarree())
image = gv.Image(xr_raster, crs=crs.PlateCarree()) (style={'cmap':'inferno'})
image
This provides this error: TypeError: shape() got an unexpected keyword argument 'gridded'
import iris
iris_raster = iris.load_cube('/path/to/raster.bil')
This errors as: ValueError: No format specification could be found for the given buffer.
I have only succeeded in importing the data using gdal and then plotting it as a Holoviews Raster or Geoviews Image:
Test = gdal.Open(Data_path + "DEM.bil")
DEM = Test.ReadAsArray()
hv.notebook_extension()
%opts Raster [xrotation=20] Points (color='r')[xrotation=20]
#Import point data
df = pd.read_csv(path+name)
stream_dataset = gv.Dataset(df, kdims=['x', 'y'], vdims=['elevation', 'chi', 'flow_distance'])
stream = hv.Points(stream_dataset)
#Import raster
raster = hv.Raster(DEM)(style={'cmap':'inferno'})
image = gv.Image(DEM, crs=crs.PlateCarree()) (style={'cmap':'inferno'})
raster + stream + image
This is the result:
Results
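Regarding the points landing in the wrong place: one possible cause (an assumption, since the result image is not available) is that the bare array returned by `ReadAsArray()` carries no georeferencing, so the image is drawn in default coordinates rather than in eastings and northings. The sketch below derives the raster's bounds from a GDAL geotransform. The helper name `bounds_from_geotransform` and the example numbers are made up for illustration.

```python
def bounds_from_geotransform(gt, width, height):
    """Return (left, bottom, right, top) for a north-up raster.

    gt is the 6-tuple returned by GDAL's GetGeoTransform():
    (x_origin, pixel_width, row_rotation, y_origin, column_rotation, pixel_height),
    where pixel_height is negative for north-up images.
    """
    x_origin, pixel_width, rot_x, y_origin, rot_y, pixel_height = gt
    if rot_x != 0 or rot_y != 0:
        raise ValueError("rotated rasters are not handled in this sketch")
    left = x_origin
    top = y_origin
    right = x_origin + width * pixel_width
    bottom = y_origin + height * pixel_height
    return (left, bottom, right, top)


# Made-up geotransform: upper-left corner at (-4, 56), 0.25-degree pixels
gt = (-4.0, 0.25, 0.0, 56.0, 0.0, -0.25)
bounds = bounds_from_geotransform(gt, width=8, height=4)
print(bounds)
```

With the real file, `gt` would come from `Test.GetGeoTransform()` and the size from `Test.RasterXSize` and `Test.RasterYSize`. HoloViews `Image` elements accept a `bounds` argument in this (left, bottom, right, top) order, so passing the tuple there is one thing to try. The bounds must also be in the same coordinate system as the point data.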

NetCDF - converting into raster and projection issues

I have the following NetCDF file, which I am trying to convert into a raster, but something is not right. The projection of the NetCDF file is not given. Based on the software I received it from, it should be lat/long, but it might be cylindrical equal area. I tried both, but I keep getting a distortion that makes it impossible to query the values at the right locations. I know the grid spacing is not even, and I am not sure whether that affects the end result (the visual here is from ArcGIS, but the problem is the same in R unless plotted with the levelplot function).
library(raster)
library(ncdf4)
library(lattice)
library(RColorBrewer)
setwd("D:/Results")
climexncdf <- nc_open("ResultsSO_month.nc")
lon <- ncvar_get(climexncdf,"Longitude")
nlon <- dim(lon)
head(lon)
lat <- ncvar_get(climexncdf,"Latitude")
nlat <- dim(lat)
head(lat)
dname <- "Weekly Growth Index"
t <- ncvar_get(climexncdf,"Step")
tmp_array <- ncvar_get(climexncdf,dname)
tmp_stack <- vector("list",length(t))
for (i in 1:length(t)) {
  tmp_stack[[i]] <- tmp_array[,,i]
}
YearData <- vector("list",52)
for (i in 1:4) {
  YearData[[i]] <- tmp_array[,,i]
}
Month1 <- YearData[c(1,2,3,4)]
# Calculate monthly averages
M1Avg <- Reduce("+",Month1)/length(Month1)
# Replace 0's with NA's
M1Avg[M1Avg==0] <- NA
# Piece of code that gives me what I need:
grid <- expand.grid(lon=lon, lat=lat)
cutpts <- seq(0,1,0.1)
# Convert to raster - work to include lat and long
M1Avg_reorder <- M1Avg[ ,order(lat) ]
M1Avg_reorder <- apply(t(M1Avg_reorder),2,rev)
M1AvgRaster <- raster(M1Avg_reorder,
                      xmn=min(lon),xmx=max(lon),
                      ymn=min(lat),ymx=max(lat),
                      crs=CRS("+proj=longlat +datum=WGS84"))
#crs=CRS("+proj=cea +lat_0=0 +lon_0=0"))
r <- projectRaster(M1AvgRaster,crs=CRS("+proj=longlat +datum=WGS84"))
plot(M1AvgRaster)
# Location file not included but any locations can be entered
locations <- read.csv("Locations.csv", header=T)
coordinates(locations) <- c("y","x")
data <- extract(M1AvgRaster,locations)
writeRaster(M1AvgRaster, "M1AvgRaster_Globe_projWGSTest", format = "GTiff")
The Python version shows that, after reordering, at least the location of the data seems correct. However, the data file seems strange: I saw data actually getting corrupted in the Python netCDF library, which I have never seen before across quite a lot of different NetCDF files. Also, the chunking and compression settings are strange; it is better not to apply them at all.
Here is a minimal Python example to get the plot:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
from netCDF4 import Dataset
ff = Dataset('ResultsSO_month.nc')
test_var = np.copy(ff.variables['Maximum Temperature'][:])
## reorder latitudes
latindex = np.argsort(ff.variables['Latitude'][:])
## Set up map and compute map coordinates
m = Basemap(projection='cea', llcrnrlat=-90, urcrnrlat=90,
            llcrnrlon=-180, urcrnrlon=180, resolution='c')
grid_coords = np.meshgrid(ff.variables['Longitude'][:], ff.variables['Latitude'][latindex])
X,Y = m(grid_coords[0],grid_coords[1])
## Plot
m.pcolormesh(X,Y,test_var[0,latindex,:])
m.drawcoastlines()
plt.colorbar()
plt.show()
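The reordering step on its own can be checked without the file. This is a minimal sketch with synthetic coordinates. It assumes, as the indexing `test_var[0,latindex,:]` above does, that the data dimensions are ordered (time, latitude, longitude).

```python
import numpy as np

# Synthetic, unevenly spaced latitudes stored out of order
lat = np.array([10.0, -35.0, 52.5, 0.0])
lon = np.array([-120.0, -60.0, 0.0, 60.0, 150.0])

# data[time, lat, lon]; each latitude row is filled with its own latitude,
# which makes it easy to verify that rows and coordinates stay paired
data = np.broadcast_to(lat[None, :, None], (2, lat.size, lon.size)).copy()

# Sort the latitude axis and apply the same permutation to the data
latindex = np.argsort(lat)
lat_sorted = lat[latindex]
data_sorted = data[:, latindex, :]

print(lat_sorted)
print(data_sorted[0, :, 0])  # matches lat_sorted row for row
```

Because the same index array is applied to both the coordinate and the data, each row keeps its original latitude even when the spacing is uneven.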

Resources