Cropped shapefile leads to different rasterize results in the same extent - R

I cropped a shapefile to a smaller extent (my AOI, area of interest) so I could work with a smaller dataset. During my workflow I rasterize the shapefile.
Here is my problem: I rasterized both shapefiles (the smaller and the bigger one) to compare the results, which should be identical, since the underlying raster has the same extent as the smaller shapefile (my AOI).
But unfortunately they are not. The CRS and the number of cells are identical, but, for example, the number of NAs is not.
I did the same procedure and workflow with synthetic data and it worked perfectly, so the problem may be my data. Here is the Dropbox link where you can download the shapefile and the raster: https://www.dropbox.com/sh/btgt2rc7uzawtx5/AADJ2YrKOnPh8gM-PPF7rmIQa?dl=0
Here is my code:
library(raster)
library(rgdal)

# Load the shapefile
setwd("C:/Users/.../R")
TESTshp <- readOGR(dsn = "test_crop_dropbox", layer = "boden_ebod_reproj")
extent(TESTshp)

# Load the raster that defines the AOI extent
setwd("C:/Users/.../R/test_crop_dropbox")
TESTraster <- raster("testraster.tif")
extent(TESTraster)

# Crop the shapefile to the raster extent
TESTshp_small <- crop(TESTshp, extent(TESTraster))

# Rasterize the full and the cropped shapefile onto the same raster
TESTrasterize <- rasterize(TESTshp, TESTraster, field = "BodTyp_gen")
TESTrasterize_small <- rasterize(TESTshp_small, TESTraster, field = "BodTyp_gen")

identical(TESTrasterize, TESTrasterize_small)
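Note that identical() is a strict, slot-by-slot comparison of the two RasterLayer objects. A sketch of a more direct check of the geometry, cell values and NA counts, using only the objects created above:
# Compare geometry and cell values explicitly (TRUE only if all checks pass)
compareRaster(TESTrasterize, TESTrasterize_small, values = TRUE, stopiffalse = FALSE)

# Count NA cells in each result
sum(is.na(values(TESTrasterize)))
sum(is.na(values(TESTrasterize_small)))

# Cross-tabulate the two layers to see which cells end up with different values
crosstab(TESTrasterize, TESTrasterize_small, useNA = TRUE)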
Do you have any suggestions as to what could be wrong?
Thanks a lot!

Related

How do I mask a multi-layer netCDF or raster with a single-layer shapefile?

I am currently working with daily precipitation data in netCDF format. The data are at 4 km resolution and cover the United States. However, I want to mask/clip the data with a much higher-resolution shapefile for a particular geographic region (about the size of a county). Ultimately, I want the output to be daily precipitation data, either at that higher resolution or at the original 4 km resolution, for the much smaller area.
I've tried a couple of different methods, with the most success using the following code:
library(raster)
library(rgdal)

# Read the daily precipitation brick and the polygon, and put the polygon in the raster's CRS
prcp_2000 <- raster::brick('pr_2000.nc')
shapefile <- shapefile("polygon_combined.shp")
shapefile <- spTransform(shapefile, crs(prcp_2000))

# Mask to the polygon, then crop to its extent
prcp_2000 <- mask(prcp_2000, shapefile)
prcp_2000 <- crop(prcp_2000, shapefile)

# Write the result back out as netCDF
outfile <- paste("prcp_", "2000_", "CS", ".nc", sep = "")
writeRaster(prcp_2000, outfile, overwrite = TRUE, format = "CDF",
            varname = "prcp", varunit = "mm/day",
            longname = "mm of precipitation per day",
            xname = "lon", yname = "lat",
            zname = "day", zunit = "days since 1900-01-01")
However, I keep getting nothing but infinities/negative infinities for the prcp output, even though the dimensions otherwise look right (day = 366, lat = 24, lon = 23). Am I missing something?
The problem is the starting shapefile, shapefile <- shapefile("polygon_combined.shp"). I imported "polygon_combined.shp" into QGIS to plot a base layer under it easily. The image I've attached shows that it starts off in the Gulf of Mexico. With a bad location to begin with, transforming it isn't going to save it later.
I don't know the origin of the data, so I have no idea how it was produced this way in the first place. One possible fix is to get the data from a different source. The EPA makes an ecoregions product that has several levels of regions; I think you want the level IV data to get your region. You may still need to transform it, but it will be located in the right place to begin with. https://www.epa.gov/eco-research/ecoregion-download-files-state-region-5#pane-47
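As a quick sanity check (a sketch using only the objects from the question): compare the polygon's extent with the raster's extent after the transform, or plot one over the other. If the two extents do not overlap, the mask is empty, which is consistent with the Inf/-Inf values described above.
library(raster)
library(rgdal)

prcp_2000 <- raster::brick('pr_2000.nc')
poly <- shapefile("polygon_combined.shp")
poly <- spTransform(poly, crs(prcp_2000))

# The two extents should overlap; here the polygons sit in the Gulf of Mexico
extent(prcp_2000)
extent(poly)

# Visual check: draw the raster's bounding box and add the polygons
plot(extent(prcp_2000))
plot(poly, add = TRUE)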

Spatial extents of raster data and shapefile do not overlap, how to fix without spTransform?

I am using a raster data set and a shapefile to find landscape metrics within buffers of the raster. Sometimes the extents overlap and sometimes they do not (using the same code). The extents are not the same despite using the spTransform function. How do I solve this? In ArcGIS they do overlap.
library(raster)
library(rgdal)

CDL2020 <- raster("CDL_2020_38.tif")
Buffer_10 <- readOGR("LAND_ANALYSIS_3.0/10km_Buffer_2020.shp")

# Set to the same projection as the raster
Buffer_10 <- spTransform(Buffer_10, proj4string(CDL2020))
Using this code I imported both layers and set them to the same projection; however, they still have different extents.
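A minimal check along those lines (a sketch, using only the objects above): confirm that the two layers really do report the same CRS after the transform, and only then compare extents. Comparing extents of layers in two different projections is meaningless, since they are expressed in different coordinate units.
library(raster)
library(rgdal)

# After spTransform() both objects should report the same CRS
compareCRS(CDL2020, Buffer_10)   # should be TRUE

# Now the extents are in the same units and can be compared directly
extent(CDL2020)
extent(Buffer_10)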

Raster increases in size when reprojected in R and QGIS

I'm using a Land cover raster of North America which is publicly available here: https://open.canada.ca/data/en/dataset/4e615eae-b90c-420b-adee-2ca35896caf6
I clipped it in R to cover Québec/Labrador:
library(raster)

veg <- raster("CanadaLandcover2015/CAN_LC_2015_CAL.tif")
e <- extent(c(1000000, 2700000, 500000, 2700000))  # all of Québec
veg_qc <- crop(veg, e)
The raster is originally in projection EPSG:3978 (NAD83 / Canada Atlas Lambert). I wanted it in lat/long to be able to extract the values at my data points.
veg_qc2 <- projectRaster(veg_qc, crs = "+proj=longlat +ellps=WGS84 +no_defs")
That single line took ~12 hours to run and used over 200 GB of temp data. Worse, there was a warning (sorry, I did not copy it) and only half the raster showed.
So I decided to try the Warp (reproject) function in QGIS. Although it worked perfectly, the output raster was 16 GB! The original clipped raster was only 693 MB.
To make things worse, I need the cell values loaded in R, so I used:
veg_qc <- getValues(veg_qc)
And I get the following error:
Error: cannot allocate vector of size 31.0 Gb
Why does the raster get bigger when reprojected?
Would there be a way to compress the raster, or to reproject without that giant increase in size?
How can I load the values of such a big raster layer?
Ultimately, I could clip the raster further with the MCP (minimum convex polygon) of my data. I could also reproject my data points and my other rasters to EPSG:3978 (although I am wondering if my other rasters may end up > 10 GB too).
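A sketch of that last option, reprojecting the points rather than the raster: reprojecting vectors is cheap and avoids resampling the land-cover raster at all. Here my_points is a hypothetical SpatialPoints* object holding the lat/long data points, and veg_qc is assumed to still hold the cropped raster (i.e. before getValues() overwrote it).
library(raster)
library(rgdal)

# Transform the points into the raster's CRS (EPSG:3978) instead of the other way round
my_points_3978 <- spTransform(my_points, crs(veg_qc))

# Extract the land-cover class under each point
vals <- extract(veg_qc, my_points_3978)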
I just had the same issue this week: after a reprojection, the file went from 40 MB to 350 MB. I found the answer in this link: Huge file size after averaging two rasters.
What I did in QGIS (not R this time) was open the following function from the menu bar: Raster > Conversion > Translate, and under Advanced parameters > Additional command-line parameters I added -co COMPRESS=LZW. This compresses the output using the LZW method. Also, according to this link and considering your dataset, you could use PACKBITS instead. I hope this helps you.
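The same compression can be requested directly from R when writing the reprojected raster, by passing GDAL creation options to writeRaster(). A sketch; the output file name is a placeholder, and the INT1U datatype is only appropriate if the land-cover class codes fit into 0-255:
library(raster)

writeRaster(veg_qc2, "veg_qc2_lzw.tif",
            format = "GTiff",
            datatype = "INT1U",            # compact integer type for class codes
            options = c("COMPRESS=LZW"),   # GDAL creation option, as in the QGIS Translate step
            overwrite = TRUE)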

How to calculate area of polygons from a large shapefile

Summary:
I'm trying to calculate the area of a large number of polygons in R. I've read a few posts about how I might do this (Example #1 & Example #2), but the problem I'm having is that my shapefile is too large (1.7 GB) to import. Given I can't import the file, I can't calculate the area of the polygons.
Extended Explanation:
I'm actually trying to calculate the area of properties in Victoria, Australia. The polygons represent these properties. I downloaded the simplified models 1 and 2 of VicMaps from Spatial Datamart for all of Victoria.
However, given the size of the shapefiles, I had to narrow my search to just one local government area (LGA) and calculate the polygon areas there (just for testing). That shapefile was 15.5 MB.
library(raster)

x <- shapefile("D:/Downloads/SDM616230/ll_gda94/shape/lga_polygon/ballarat/VMPROP/PROPERTY_PRIMARY_APPROVED.shp")
crs(x)
x$area_sqkm <- area(x) / 1000000  # area() returns square metres; divide by 1e6 for km^2
This worked, but it's not a practical solution to my problem given there are many LGAs in Victoria and I plan to eventually follow the same process for Queensland and NSW.
However, trying to load a larger shapefile doesn't work and results in the error "Error: memory exhausted (limit reached?)".
I've tried using readShapePoly, readOGR, st_read and read_sf to get the large shapefile into R, but they don't work; I think the file is just too large. I tried using a select query within read_sf in an effort to reduce the size of the file I was reading, but that didn't work either. I've read online that I should split the shapefile into just the data I need to reduce the size, but I have no idea how to do that.
Hope you can help.
Obviously the file is too big to load into memory on a single machine. I think the options then are either:
1) Split the file into smaller ones and process them one by one (see the sketch below). See
https://gis.stackexchange.com/questions/195508/split-a-shapefile-into-smaller-files-on-linux-command-line
2) Use a DBMS or data warehouse to do it; they handle such batching automatically.
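A sketch of option 1) in R using sf, pushing an OGR SQL query down to the file so that only one subset of polygons is in memory at a time. The path, layer name, LGA_NAME attribute and LGA values are assumptions; check yours with st_layers() and by reading a few rows first.
library(sf)

dsn <- "D:/Downloads/statewide/PROPERTY_PRIMARY_APPROVED.shp"   # hypothetical path to the full shapefile
lgas <- c("BALLARAT", "GREATER GEELONG")                        # hypothetical LGA names

area_by_lga <- lapply(lgas, function(lga) {
  # Only the rows matching this LGA are read from disk
  q <- sprintf("SELECT * FROM PROPERTY_PRIMARY_APPROVED WHERE LGA_NAME = '%s'", lga)
  x <- st_read(dsn, query = q, quiet = TRUE)
  x$area_sqkm <- as.numeric(st_area(x)) / 1e6   # st_area() returns square metres
  x
})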

How can you crop raster layers in R in a batch and change the projection?

I was working with spatial data to get ready for analyses - I have a DEM at the desired extent of my study area, though I have ~39 other layers at the national scale (US). Is there a way to crop all of these 39 layers to the same extent as the DEM at once?
Also, I will be overlaying the output with other layers in a different projection. Is it possible to adjust the projection and pixel size of the output layers?
I am trying to use freeware as much as possible for my data manipulation...
I had the problem above, but have written a function in R to do all of this in a batch - see below. I had 39 climate data layers at the scale of the continental U.S. (from the PRISM Climate Data group; http://www.prism.oregonstate.edu/), and wanted to clip them to the extent of a DEM in southern California, reproject them, and export them for easy import and use with other layers in SAGA GIS. Below is the code, with an example of how to run it; set the working directory to the folder that contains the layers you want to crop, and only those layers.
During the processing, all data are stored in memory, so with huge datasets it might get hung up because of lack of memory; that would be worth improving.
Also, a response on the R Forum provided a shorter, more elegant way to do it (a condensed sketch of that idea follows the example below): http://permalink.gmane.org/gmane.comp.lang.r.geo/18320
I hope somebody finds it useful!
#########################################
#BatchCrop Function ###
#by Mike Treglia, mtreglia#gmail.com ###
###Tested in R Version 3.0.0 (64-bit), using 'raster' version 2.1-48 and 'rgdal' version 0.8-10
########################################
#This function crops .asc raster files in working directory to extent of another layer (referred to here as 'reference' layer), converts to desired projection, and saves as new .asc files in the working directory. It is important that the original raster files and the reference layer are all in the same projection, though different pixel sizes are OK. The function can easily be modified to use other raster formats as well
#Note, Requires package 'raster'
#Function Arguments:
#'Reference' refers to name of the layer with the desired extent; 'OutName' represents the intended prefix for output files; 'OutPrj' represents the desired output projection; and 'OutRes' represents the desired Output Resolution
BatchCrop <- function(Reference, OutName, OutPrj, OutRes){
  filenames <- list.files(pattern="*.asc", full.names=TRUE) #Extract list of file names from working directory
  library(raster) #Calls 'raster' library
  #Function 'f1' imports data listed in 'filenames' and assigns projection
  f1 <- function(x, z) {
    y <- raster(x)
    projection(y) <- CRS(z)
    return(y)
  }
  import <- lapply(filenames, f1, projection(Reference))
  cropped <- lapply(import, crop, Reference) #Crop imported layers to reference layer, argument 'x'
  #Function 'f2' changes projection of cropped layers
  f2 <- function(x, y) {
    x <- projectRaster(x, crs=OutPrj, res=OutRes)
    return(x)
  }
  output <- lapply(cropped, f2, OutPrj)
  #Use a 'for' loop to iterate writeRaster function for all cropped layers
  for(i in 1:length(filenames)){
    writeRaster(output[[i]], paste(deparse(substitute(OutName)), i), format='ascii')
  }
}
#############################################
###Example Code using function 'BatchCrop'###
#############################################
#Data layers to be cropped downloaded from: http://www.prism.oregonstate.edu/products/matrix.phtml?vartype=tmax&view=maps [testing was done using 1981-2010 monthly and annual normals; can use any .asc layer within the bounds of the PRISM data, with projection as lat/long and GRS80]
#Set Working Directory where data to be cropped are stored
setwd("D:/GIS/PRISM/1981-2010/TMin")
#Import Reference Layer
reference <- raster("D:/GIS/California/Hab Suitability Files/10m_DEM/10m DEM asc/DEM_10m.asc")
#Set Projection for Reference Layer
projection(reference) <- CRS("+proj=longlat +ellps=GRS80")
#Run Function [desired projection is UTM, zone 11, WGS84; desired output resolution is 800m]
BatchCrop(Reference=reference, OutName=TMinCrop, OutPrj="+proj=utm +zone=11 +datum=WGS84", OutRes=800)
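For comparison, a condensed sketch in the same spirit as the shorter approach mentioned above (not necessarily identical to the linked post): stack all the .asc files, crop and reproject the stack in one go, and write the layers out individually. Paths, projections and the output prefix mirror the example; suffix = "numbers" simply appends layer numbers to the output file names.
library(raster)

filenames <- list.files(pattern = "\\.asc$", full.names = TRUE)
s <- stack(filenames)
projection(s) <- CRS("+proj=longlat +ellps=GRS80")

# Crop to the reference DEM, then reproject the whole stack
s_crop <- crop(s, reference)
s_out <- projectRaster(s_crop, crs = "+proj=utm +zone=11 +datum=WGS84", res = 800)

# Write each layer to its own .asc file, numbered automatically
writeRaster(s_out, filename = "TMinCrop", format = "ascii",
            bylayer = TRUE, suffix = "numbers", overwrite = TRUE)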
